Optimizing the integration and energy efficiency of through...

149
Microelectronics System Design Research Group School of Electrical, Electronic & Computer Engineering Optimizing the integration and energy efficiency of through silicon via-based 3D interconnects Panagiotis Asimakopoulos Technical Report Series NCL-EECE-MSD-TR-2011-173 November 2011

Transcript of Optimizing the integration and energy efficiency of through...

Page 1: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate

Microelectronics System Design Research Group

School of Electrical Electronic amp Computer Engineering

Optimizing the integration and energy efficiencyof through silicon via-based 3D interconnects

Panagiotis Asimakopoulos

Technical Report Series

NCL-EECE-MSD-TR-2011-173

November 2011

Contact panagiotisasimakopoulosn13la13uk

NCL-EECE-MSD-TR-2011-173

Copyright ccopy 2011 Newcastle University

Microelectronics System Design Research Group

School of Electrical Electronic amp Computer Engineering

Merz Court

Newcastle University

Newcastle upon Tyne NE1 7RU UKhttpasyn13orguk

O P T I M I Z I N G T H E I N T E G R AT I O N A N DE N E R G Y E F F I C I E N C Y O F T H R O U G H

S I L I C O N V I A - B A S E D 3 D I N T E R C O N N E C T S

panagiotis asimakopoulos

A Thesis Submitted for the Degree of Doctor of Philosophyat Newcastle University

School of Electrical Electronic and Computer Engineering

Newcastle upon Tyne

November 2011

Panagiotis Asimakopoulos Optimizing the integration and energy

efficiency of through silicon via-based 3D interconnects PhD Thesiscopy November 2011

A B S T R A C T

The aggressive scaling of CMOS process technology has been driv-ing the rapid growth of the semiconductor industry for more thanthree decades In recent years the performance gains enabledby CMOS scaling have been increasingly challenged by highly-parasitic on-chip interconnects as wire parasitics do not scale atthe same pace Emerging 3D integration technologies based onvertical through-silicon vias (TSVs) promise a solution to the inter-connect performance bottleneck along with reduced fabricationcost and heterogeneous integration

As TSVs are a relatively recent interconnect technology innov-ative test structures are required to evaluate and optimise theprocess as well as extract parameters for the generation of designrules and models From the circuit designerrsquos perspective criticalTSV characteristics are its parasitic capacitance and thermomech-anical stress distribution This work proposes new test structuresfor extracting these characteristics The structures were fabric-ated on a 65nm 3D process and used for the evaluation of thattechnology

Furthermore as TSVs are implemented in large densely inter-connected 3D-system-on-chips (SoCs) the TSV parasitic capacitancemay become an important source of energy dissipation Typicallow-power techniques based on voltage scaling can be usedthough this represents a technical challenge in modern techno-logy nodes In this work a novel TSV interconnection scheme isproposed based on reversible computing which shows frequency-dependent energy dissipation The scheme is analysed using the-oretical modelling while a demonstrator IC was designed basedon the developed theory and fabricated on a 130nm 3D process

iii

P U B L I C AT I O N S

Some ideas and figures have appeared previously in the followingpublications

1 D Perry J Cho S Domae P Asimakopoulos A YakovlevP Marchal G Van der Plas and N Minas An efficientarray structure to characterize the impact of through siliconvias on fet devices In Proc IEEE Int Microelectronic TestStructures (ICMTS) Conf pages 118ndash122 2011 (best paper

award)

2 A Mercha G Van der Plas V Moroz I De Wolf P Asi-

makopoulos N Minas S Domae D Perry M Choi ARedolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron DevicesMeeting (IEDM) 2010

3 A Mercha A Redolfi M Stucchi N Minas J Van OlmenS Thangaraju D Velenis S Domae Y Yang G Katti RLa- bie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D Perry G Vander Plas J H Cho P Marchal Y Travaly E Beyne S Biese-mans and B Swinnen Impact of thinning and throughsilicon via proximity on high-k metal gate first cmos per-formance In Proc Symp VLSI Technology (VLSIT) pages109ndash110 2010

4 P Asimakopoulos G Van der Plas A Yakovlev and PMarchal Evaluation of energy-recovering interconnects forlow-power 3d stacked ics In Proc IEEE Int Conf 3D Sys-tem Integration 3DIC 2009 pages 1ndash5 2009

5 P Asimakopoulos and A Yakovlev An AdiabaticPower-Supply Controller For Asynchronous Logic Circuits20th UK Asynchronous Forum pages 1ndash4 2008

iv

In the end I hope therersquos a little note somewhere

that says I designed a good computer

mdash Steve Wozniak [79]

A C K N O W L E D G E M E N T S

First and foremost I would like to express my gratitude to mysupervisor Prof Alex Yakovlev who provided the perfect balancebetween freedom and guidance allowing me to obtain sphericalknowledge and scope in microelectronics but at the same time topromptly complete my studies Irsquom also grateful to the Engineer-ing and Physical Science Research Council (EPSRC) for fundingmy PhD studies through their Doctoral Training Accounts

As part of my research project I had the opportunity to spendsome time at Imec Belgium The experiences and lessons learnedduring my stay at Imec will undoubtedly prove valuable in myforthcoming professional career I would like to thank Dan Perryfrom Qualcomm for sharing his experience and ideas for the de-sign of the characterization structures implemented on the 65nm

Imec process Special thanks to Shinichi Domae from Panasonicand Dr Dimitrios Velenis for their assistance with the 65nm wafermeasurements as well as Marco Facchini for the time he spendtrying to measure the energy-recovery circuits Last but not leastIrsquod like to thank Geert Van der Plas and Paul Marchal for theiradvices and guidance throughout my stay at Imec

Finally many thanks to my colleagues at Newcastle Universityfor assisting me in diverse ways during my studies I would liketo especially mention Dr Santosh Shedabale and James Dochertyfor reviewing and commenting on parts of this text as well asDr Robin Emery for the numerous technical discussions and forhelping setup the design tools

v

C O N T E N T S

1 Introduction 1

11 Research goals and contribution 2

12 Thesis outline 3

2 Background 5

21 3D integrated circuits 5

211 3D integration approaches 6

212 TSV interconnects 9

22 Energy-recovery logic 13

221 Energy-recovery architectures 17

222 Power-clock generators 18

23 Full-custom design flow 20

24 Summary 21

3 Characterization of TSV capacitance 23

31 Introduction 23

32 TSV parasitic capacitance 23

33 Integrated capacitance measurement techniques 24

34 Electrical characterization 27

341 Physical implementation 28

342 Measurement setup 36

343 Experimental results 37

35 Summary and conclusions 44

4 TSV proximity impact on MOSFET performance 46

41 Introduction 46

42 TSV-induced thermomechanical stress 46

43 Test structure implementation 48

431 MOSFET arrays 51

432 Digital selection logic 55

44 Experimental measurements 61

441 Data processing methodology 61

442 Measurement setup 62

443 PMOS (long channel) 64

444 PMOS (short channel) 71

45 Summary and conclusions 73

5 Evaluation of energy-recovery for TSV interconnects 75

51 Introduction 75

52 Energy-recovery TSV interconnects 75

53 Analysis 79

531 Adiabatic driver 80

532 Inductorrsquos parasitic resistance 82

533 Switch M1 82

534 Timing Constraints 83

54 Simulations 83

541 Comparison to CMOS 86

vi

contents vii

55 Summary and conclusions 91

6 Energy-recovery 3D-IC demonstrator 92

61 Introduction 92

62 Physical implementation 92

621 Adiabatic driver 94

622 Pulse-to-level converter 95

623 Data generator 95

624 Pulse generator 95

625 Integrated inductor 97

626 Top level design 98

627 CMOS IO circuit 101

63 Post-layout simulation and comparison 101

64 Experimental measurements 106

641 Measurement setup 106

642 Results 106

65 Summary and conclusions 111

7 Conclusions 114

71 Main contributions 115

72 Future work 116

a Appendix A Photos 118

b Appendix B Statistics 120

bibliography 122

L I S T O F F I G U R E S

Figure 11 Scaling trend of logic and interconnect delay[30] 2

Figure 21 Comparison of 2D3D Integration 5

Figure 22 System-in-a-package 6

Figure 23 Package-on-package 6

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7

Figure 25 3D wafer-level-packaging 7

Figure 26 3D-stacked IC 8

Figure 27 TSV process steps 9

Figure 28 Before thinningafter thinning 10

Figure 29 TSV stacking 10

Figure 210 Conventional AND gate 14

Figure 211 CMOS switching 16

Figure 212 Adiabatic switching 16

Figure 213 Basic 2N-2P adiabatic circuit 17

Figure 214 2N-2P timing 17

Figure 215 Adiabatic driver 18

Figure 216 Stepwise charging generator 19

Figure 217 1N power clock generator 20

Figure 218 Design flow 22

Figure 219 Inverter in Virtuoso schematic 22

Figure 220 Inverter in Virtuoso layout 22

Figure 31 Isolated 2D wire capacitance 24

Figure 32 TSV parasitic capacitance 25

Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25

Figure 34 CBCM pseudo-inverters 27

Figure 35 Current as a function of frequency in CBCM 28

Figure 36 CBCM pseudo-inverter pair in layout 29

Figure 37 PRC1 test pad module (layout) 30

Figure 38 PRC1 test pad module (micrograph) 30

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34

Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35

viii

List of Figures ix

Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35

Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36

Figure 316 Measurement setup 36

Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39

Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40

Figure 319 TSV capacitance measurement results onwafer D11 40

Figure 320 TSV capacitance measurement results onwafer D18 41

Figure 321 Simulated oxide liner thickness variation 42

Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42

Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44

Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44

Figure 41 TSV thermomechanical stress simulation [40] 47

Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48

Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49

Figure 44 PTP1 test pad module (layout) 50

Figure 45 PTP1 test pad module (micrograph) 50

Figure 46 MOSFET dimensions and rotations 51

Figure 47 1 TSV MOSFET array configuration 52

Figure 48 4 diagonal TSVs MOSFET array configuration52

Figure 49 4 lateral TSVs MOSFET array configuration 53

Figure 410 9 TSVs MOSFET array configuration 53

Figure 411 MOSFET array detail 55

Figure 412 Digital selection logic 56

Figure 413 MOSFET array with digital selection logic 57

Figure 414 4-bit Gate-enableArray-enable decoder 58

Figure 415 Transmission gate for MOSFET Gate terminals59

Figure 416 Transmission gates for MOSFET Drain-Source terminals 59

Figure 417 2-bit Drain-select decoder 60

Figure 418 Processing of measurement results 62

Figure 419 Interpolation of measurement results 63

Figure 420 Measurement setup 63

Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65

List of Figures x

Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66

Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66

Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67

Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67

Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68

Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69

Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69

Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70

Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70

Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71

Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71

Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72

Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72

Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76

Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77

Figure 53 Adiabatic driver 78

Figure 54 A simple pulse-to-level converter (P2LC) 78

Figure 55 Resonant pulse generator with dual-rail dataencoding 79

Figure 56 Energy-recovery scheme for TSV interconnects79

Figure 57 Energy-recovery timing constraints 83

Figure 58 Test case for the evaluation of the theoret-ical model 84

Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85

Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85

Figure 511 Test case for comparison of energy-recoveryto CMOS 86

Figure 512 Improved P2LC design 87

List of Figures xi

Figure 513 Simulated energy dissipation of improvedP2LC design 87

Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88

Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88

Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89

Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =

80 f F f = 500MHz) 89

Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90

Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90

Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93

Figure 62 Energy-recovery IO circuit schematic dia-gram 94

Figure 63 Adiabatic driver (layoutschematic view) 94

Figure 64 P2LC (layoutschematic view) 95

Figure 65 Data generator (layoutschematic view) 96

Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96

Figure 67 Pulse generator (layoutschematic view) 97

Figure 68 Pulse generator InputOutput signals 97

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98

Figure 610 Planar square spiral inductor designed inASITIC 99

Figure 611 Resonant pulse distribution tree 100

Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101

Figure 613 Energy-recovery IO circuit in layout view 102

Figure 614 CMOS IO circuit schematic diagram 103

Figure 615 CMOS driver schematic 103

Figure 616 CMOS IO circuit in layout view 103

Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105

Figure 618 Energy-recovery IO circuit micrograph 107

List of Figures xii

Figure 619 CMOS IO circuit micrograph 107

Figure 620 Measurement setup 107

Figure 621 Oscilloscope output for signals S1B and OC-MOS 109

Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109

Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110

Figure 624 Oscilloscope output for signals S1A and OER110

Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111

Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111

Figure 627 Simulation of circuit with fault hypothesis 112

Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113

Figure A1 3D65 300mm wafer (on chuck) 118

Figure A2 Probe card 118

Figure A3 3D130 200mm wafer (on chuck) 119

Figure A4 Probe head 119

L I S T O F TA B L E S

Table 21 3D interconnect technologies comparison 8

Table 22 Technology parameters for Imec 3D130 process11

Table 23 Technology parameters for Imec 3D65 process11

Table 24 Comparison energy-recoveryconventionalCMOS 16

Table 31 Test structures on the PRC1 module 31

Table 32 PRC1 test pad instrument connections 37

Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38

Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39

Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43

Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45

Table 41 Coefficients of thermal expansion 47

Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54

Table 43 MOSFET array configurations selected forexperimental evaluation 61

Table 44 PTP1PTP2 test pad module instrumentconnections 64

Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73

Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84

Table 61 Energy-recovery IO circuit parameters 93

Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100

Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104

Table 64 Post-layout simulated energy dissipation Area requirements 106

Table 65 Energy-recovery circuit instrument connec-tions 108

Table 66 CMOS circuit instrument connections 108

xiii

A C R O N Y M S

2D two-dimentional

2N-2P 2 nMOS2 pMOS adiabatic logic

3D-IC three-dimentional integrated circuit

3D-SIC 3D stacked-IC

3D-SIP 3D system-in-package

3D three-dimentional

3D-WLP 3D wafer-level-packaging

BCB benzocyclobutene

BEOL back-end-of-line

CAL clocked adiabatic logic

CBCM charge-based capacitance measurement

CMOS complementary metal-oxide-semiconductor

CMP chemical-mechanical planarization

CTE coefficient of thermal expansion

Cu copper

CVD chemical vapor deposition

D2D die-to-die

D2W die-to-wafer

DCVSL differential cascode voltage switch logic

DRC design rule check

DUT device under test

ECRL efficient charge recovery logic

EDA electronic design automation

FEM finite element method

FEOL front-end-of-line

GPIB general purpose interface bus

IC integrated circuit

xiv

acronyms xv

IO inputoutput

KOZ keep-out-zone

LC inductance-capacitance

LVS layout versus schematic

MEMS microelectromechanical systems

MIM metal-insulator-metal

MOSFET metal-oxide-semiconductor field-effect transistor

MOS metal-oxide-semiconductor

MPU microprocessor unit

MuGFET multiple gate field-effect transistor

NMOS n-channel MOSFET

P2LC pulse-to-level converter

PAL pass-transistor adiabatic logic

PCB printed circuit board

PCG power-clock generator

PMD pre-metal dielectric

PMOS p-channel MOSFET

PVT process voltage temperature

QSERL quasi-static energy recovery logic

RC resistance-capacitance

RDL re-distribution layer

RLC resistance-inductance-capacitance

RMS root mean square

SIP system-in-a-package

Si silicon

SMU source measurement unit

SoC system-on-chip

TaN tantalum nitride

TSV through-silicon via

W tungsten

acronyms xvi

W2W wafer-to-wafer

WLP wafer-level-packaging

1I N T R O D U C T I O N

Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965

chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]

As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects

In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)

The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and

1 International Technology Roadmap for Semiconductors

1

11 research goals and contribution 2

Figure 11 Scaling trend of logic and interconnect delay [30]

innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]

Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D

interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density

11 research goals and contribution

As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-

ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]

The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future

12 thesis outline 3

3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection

The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution

The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs

are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process

The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements

12 thesis outline

This thesis is organised into seven Chapters and two Appendixesas follows

2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)

12 thesis outline 4

Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described

Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods

Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process

Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design

Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements

Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work

Appendix A includes photos used as reference in various partsof this work

Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data

2B A C K G R O U N D

The work presented in this thesis contributes in the fields of 3D

integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters

21 3d integrated circuits

CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package

There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy

Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of

ABCDEFBA ABCDEFBA

Figure 21 Comparison of 2D3D Integration

5

21 3d integrated circuits 6

ABCDEFBC

E

Figure 22 System-in-a-package

ABCDEFBC

E

Figure 23 Package-on-package

a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities

The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor

211 3D integration approaches

3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure

bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages

bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)

bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion

The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)

3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps

21 3d integrated circuits 7

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]

A

BCDEAFE

FFDC

CFE

Figure 25 3D wafer-level-packaging

take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]

3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]

The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-

SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost

A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21

21 3d integrated circuits 8

ABCD

EFFDB

F

Figure 26 3D-stacked IC

3D interconnect

technologyAdvantages Disadvantages

3D system-in-package (3D-SIP)

Simple existingtechnology bestheterogeneous

integration

Performance(speed power) low

densitymanufacturing costin high-quantities

3D wafer-level-packaging (3D-

WLP)

Existingtechnology good

compromisebetween

performance-manufacturing

cost

Averageperformance and

density

3D stacked-IC (3D-SIC)

Performance(speed power)

high densitymanufacturing costin high-quantities

Manufacturing costin low-quantities

not ready formanufacturing

Table 21 3D interconnect technologies comparison

21 3d integrated circuits 9

ABBCDE

FBACACDA

ACCD F

Figure 27 TSV process steps

212 TSV interconnects

Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection

In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie

Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively

As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]

21 3d integrated circuits 10

AAAB

CAAAB

ABC

DEF

Figure 28 Before thinningafter thinning

ABC ABC

DE DE

F

DEE

B

CBA

Figure 29 TSV stacking

21 3d integrated circuits 11

Minimum devicechannel length (nm)

130

Supply voltage (V) 12

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 22

TSV resistance (mΩ) ~20

TSV capacitance( f F)

~40

Wafer diameter(mm)

200

Table 22 Technology paramet-ers for Imec 3D130

process

Minimum devicechannel length (nm)

70

Supply voltage (V) 10

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 40

TSV resistance (mΩ) ~40

TSV capacitance( f F)

~80

Wafer diameter(mm)

300

Table 23 Technology paramet-ers for Imec 3D65 pro-cess

Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing

Design tools

Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs

Thermomechanical stress

TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur

21 3d integrated circuits 12

The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance

Thermal analysis

The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink

Electrical characteristics

The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]

Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV

structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process

22 energy-recovery logic 13

Testing

Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems

22 energy-recovery logic

TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology

1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections

2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them

Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that

22 energy-recovery logic 14

Figure 210 Conventional AND gate

were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6

Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded

Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output

Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to

22 energy-recovery logic 15

perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]

In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]

The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =

CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]

E = QVDD = CV2DD (21)

From the total transferred energy only half ( 12 CV2

DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C

The adiabatic charging of an equivalent node capacitance C

is modelled in Figure 212 In contrast to the conventional CMOS

circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to

EAdiabatic prop

(

RC

T

)

CV2DD (22)

The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters

22 energy-recovery logic 16

A

B

CC

D

E

D

Figure 211 CMOS switching

AA

B

C

B

Figure 212 Adiabatic switching

Conventional CMOS Energy-recovery CMOS

2 states True False 3 states True False Off

Energy of information bitdissipated as heat on the

pull-down network

Energy of information bitrecovered by the

power-supply

High speed Lowmedium speed

High energy dissipation Very low energy dissipation

Area efficient Some area overhead

Table 24 Comparison energy-recoveryconventional CMOS

Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]

A comparison between energy-recovery and conventional CMOSis summarised in Table 24

22 energy-recovery logic 17

ABCABC

Figure 213 Basic 2N-2P adiabaticcircuit

ABC

ABC

DEF

EE

DEF

Figure 214 2N-2P timing

221 Energy-recovery architectures

Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency

A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P

circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern

The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel

22 energy-recovery logic 18

Figure 215 Adiabatic driver

MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network

The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]

The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle

Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]

222 Power-clock generators

Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and

22 energy-recovery logic 19

A

A

B

C

C

Figure 216 Stepwise charging generator

recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators

Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow

Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at

f =1

2πradic

LCLOAD

(23)

As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is

23 full-custom design flow 20

A

BCD

E

DFF

CD

Figure 217 1N power clock generator

partially recovered and stored in the inductor for later use RLC

oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip

23 full-custom design flow

All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal

In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available

The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case

1 Cadence Design Systems Inc (httpwwwcadencecom)

24 summary 21

the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step

Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics

24 summary

In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process

2 Mentor Graphics Inc (httpwwwmentorcom)

24 summary 22

ABCDEA

FCCAAE

ABCDEAEC

DE

E

F

CEA

EEAEA

EDE

CAC

E

EC

CAC

Figure 218 Design flow

Figure 219 Inverter in Virtuososchematic

Figure 220 Inverter in Virtuosolayout

3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E

31 introduction

In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology

In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]

From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system

Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated

32 tsv parasitic capacitance

A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric

1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments

23

33 integrated capacitance measurement techniques 24

ABC

Figure 31 Isolated 2D wire capacitance

material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is

C =εox

hwl (31)

In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material

In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)

The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)

33 integrated capacitance measurement techniques

Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV

33 integrated capacitance measurement techniques 25

ABCDE

F

FD

F

F

FD

Figure 32 TSV parasitic capacitance

ABCDE FACDE

ECDE

BCFACDE

FCCDED DBCDE

D

DAB

Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor

33 integrated capacitance measurement techniques 26

characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories

1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]

2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]

The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer

On-chip techniques can achieve better accuracy when the DUT

capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]

In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows

CDUT =Im minus Ire f

VDD middot f(32)

The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of

34 electrical characterization 27

AB

AC

DE

FE

DE

FE

AB

AC

B

Figure 34 CBCM pseudo-inverters

Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot

The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET

capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz

should be adequate to avoid the MOS behaviour of the TSV

34 electrical characterization

A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented

34 electrical characterization 28

AB

CADE

F

AB

E

CAD

B

Figure 35 Current as a function of frequency in CBCM

on the same chip that went beyond the core objectives of thiswork

The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed

341 Physical implementation

In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result

Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics

34 electrical characterization 29

Poly dummy fills

Metal2Metal1PolysiliconActive

1μm 5μm

Figure 36 CBCM pseudo-inverter pair in layout

Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38

The CBCM pseudo-inverters were implemented on either TOP

or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below

Structure rsquoArsquo Single TSV capacitance (TOP die)

The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to

34 electrical characterization 30

DTOP DBTM

B10 A10

BREF AREF

B1 A1

R1 R2

SET VDD

RESET GND

C15 C10

C50 C20

SET VDD

ERC SET

EREF EC Metal2 (TOP)Metal1 (TOP)

Metal1 (BTM)Metal2 (BTM)

Figure 37 PRC1 test pad module(layout)

Figure 38 PRC1 test pad module(micrograph)

34 electrical characterization 31

Structure A B

TSVs - 1 2x5 - 1 2x5

Inverterlocation

TOP TOP TOP BTM BTM BTM

TSVconnection

TOP TOP TOP BTM BTM BTM

Multi-TSVconnection

- - TOP - - BTM

Multi-TSVpitch (microm)

- - 15 - - 15

Pad name AREF A1 A10 BREF B1 B10

Structure C D

TSVs 2x2 2x2 2x2 2x2 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP TOP

TSVconnection

TOP TOP TOP TOP TOP TOP

Multi-TSVconnection

TOP TOP TOP TOP TOP BTM

Multi-TSVpitch (microm)

10 15 20 50 20 20

Pad name C10 C15 C20 C50 DTOP DBTM

Structure R E

TSVs 1 1 - 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP

TSVconnection

TOP TOP - TOP TOP

Multi-TSVconnection

- - - TOP BTM

Multi-TSVpitch (microm)

- - - 20 20

Pad name R1 R2 EREF EC ERC

Table 31 Test structures on the PRC1 module

34 electrical characterization 32

Pseudo-inverters

2x5 TSVs

TSV

Reference

Metal2 (TOP)Metal1 (TOP)

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)

the single TSV DUT the structure also contains 10 parallel TSVs

arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude

Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)

The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side

Structure rsquoCrsquo Variable pitch TSV capacitance

The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the

34 electrical characterization 33

2x5 TSVs

Reference

TSV

Pseudo-inverters

Metal2 (BTM)Metal1 (BTM)

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)

literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process

The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative

Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-

TOM die

The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking

34 electrical characterization 34

2x2 TSVs - 10μm

2x2 TSVs - 15μm

2x2 TSVs - 20μm

2x2 TSVs - 50μm

Metal2 (TOP)Metal1 (TOP)

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die

34 electrical characterization 35

Metal2 (TOP)Metal1 (TOP)

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Driver

Reference

Driver

Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)

Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy

Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability

Structure rsquoErsquo TSV capacitance effect on signal propagation delay

Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult

34 electrical characterization 36

ABC

D

ECBFC

ABC

Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)

ABC

DEF

CD

A

CA

A

Figure 316 Measurement setup

342 Measurement setup

The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316

The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-

2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup

34 electrical characterization 37

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

SET InputPulse generator

(Non-overlapping with RESET)

RESET InputPulse generator

(Non-overlapping with SET)

A[REF 1 10] InputSMU (Inverter source - 1V

supply)

B[REF 1 10] InputSMU (Inverter source - 1V

supply)

C[10 15 2050]

InputSMU (Inverter source - 1V

supply)

D[TOP BTM] InputSMU (Inverter source - 1V

supply)

R[1 2] InputSMU (Inverter source - 1V

supply)

E[REF C RC] Output Oscilloscope (high-impedance)

Table 32 PRC1 test pad instrument connections

sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32

343 Experimental results

There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods

When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy

34 electrical characterization 38

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 454

Mean capacitance 709 f F

Standard deviation (σ) 19 f F

Coefficient of variation (3σmicro) 80

Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo

The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability

Die to die (D2D) variability

The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)

On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid

On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691

34 electrical characterization 39

-60 -40 -20 00 20 40 60 80

σ-σ

706

Deviation from mean (fF)

0

005

01

015

02

025

Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 414

Mean capacitance 895 f F

Standard deviation (σ) 22 f F

Coefficient of variation (3σmicro) 74

Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo

of the experimental data within plusmnσ of the mean capacitancevalue

Correlation between TSV capacitance and oxide liner thickness vari-

ation

The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers

Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication

34 electrical characterization 40

σ-σ

691

Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70

0

002

004

006

008

01

012

014

016

018

02

Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function

075

080

085

090

095

100

105

110

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

075

080

085

090

095

100

105

Wafer area

Figure 319 TSV capacitance measurement results on wafer D11

34 electrical characterization 41

086088090092094096098100102104106108

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

086

088

090

092

094

096

098

100

102

104

106

108

Wafer area

Figure 320 TSV capacitance measurement results on wafer D18

tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo

Wafer to wafer (W2W) variability

Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters

34 electrical characterization 42

109

100

095Nor

mal

ized

oxi

de th

ickn

ess

-5-10 -15 -20 -25 -30

510

1520

2530

0Die Y

Die X

Figure 321 Simulated oxide liner thickness variation

60 65 70 75 80 85 90 95 1000

005

01

015

02

025D11 D18

plusmnσ

plusmnσ

Capacitance (fF)

Figure 322 TSV capacitance W2W variability - Probability density func-tions

34 electrical characterization 43

Structure rsquoCrsquo (Fig 311)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 424

Experiment C10 C50

Number of TSVs 4 4

Mean capacitance 4073 f F 4125 f F

Standard deviation (σ) 84 f F 74 f F

Population within σ 695 653

Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo

Variable pitch TSV capacitance

The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm

were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires

As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50

are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability

To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]

Measurement accuracy of CBCM

The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we

3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)

35 summary and conclusions 44

376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320

001

002

003

004

005

006C10 C50

Capacitance (fF)

plusmnσC10

plusmnσC50

695 653

Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function

-80 -60 -40 -20 00 20 40 60 80 1000

002

004

006

008

01

012

014

016

018σ-σ

688

Deviation from mean (fF)

Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function

observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs

connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance

35 summary and conclusions

In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement

35 summary and conclusions 45

Method LCR CBCM

Total dies 472 472

Processed dies (excluding outliers) 393 414

Mean capacitance 820 f F 895 f F

Standard deviation (σ) 23 f F 22 f F

Coefficient of variation (3σmicro) 86 74

Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo

results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]

The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration

The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV

parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM

In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process

4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E

41 introduction

The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV

filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance

Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models

In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]

42 tsv-induced thermomechanical stress

Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or

46

42 tsv-induced thermomechanical stress 47

Material CTE at 20 ˚C (10minus6K)

Si 3

Cu 17

W 45

Table 41 Coefficients of thermal expansion

Figure 41 TSV thermomechanical stress simulation [40]

polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate

Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction

It is well known that stress can affect carrier mobility in MOS-

FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance

Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type

43 test structure implementation 48

Figure 42 TSV thermomechanical stress interaction between two TSVs[40]

proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative

The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections

43 test structure implementation

TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible

43 test structure implementation 49

Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]

The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET

devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types

Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP

die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45

43 test structure implementation 50

A0

B0

C0

D0

E0

F0

G0

H0

I0

J0

A90

B90

E90

F90

G90

J90

DS3 DF3

DF2 DS0

DS2 DF0

DF1 DS1

SS SF

DEC6 VDD

DEC7 GND

DEC8 VG

DEC4 DEC9

DEC5 GND

DEC2 DEC3

DEC0 DEC1

Arraydecoder

Draindecoder

Figure 44 PTP1 test pad mod-ule (layout)

Figure 45 PTP1 test pad mod-ule (micrograph)

43 test structure implementation 51

Metal2Metal1PolyActive

TSV

MOSFETs

07μm007μm 0deg

07μm007μm 90deg

05μm05μm 90deg

05μm05μm 0deg

Figure 46 MOSFET dimensions and rotations

431 MOSFET arrays

Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)

The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs

lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44

The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)

Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution

43 test structure implementation 52

TSV

MOSFETs Substrate taps

Figure 47 1 TSV MOSFET array configuration

MOSFETs Substrate taps

TSV TSV

TSV TSV

Figure 48 4 diagonal TSVs MOSFET array configuration

43 test structure implementation 53

MOSFETs Substrate taps

TSV

TSV TSV

TSV

Figure 49 4 lateral TSVs MOSFET array configuration

MOSFETs Substrate taps

TSV TSV TSV

TSV TSV TSV

TSV TSV TSV

Figure 410 9 TSVs MOSFET array configuration

43 test structure implementation 54

Configuration RotationFET

width(microm)

FETlength(microm)

Name

0 0deg 07 007 A0

1 0deg 07 007 B0

4diagonal 0deg 07 007 C0

4lateral 0deg 07 007 D0

9 0deg 07 007 E0

0 0deg 05 05 F0

1 0deg 05 05 G0

4diagonal 0deg 05 05 H0

4lateral 0deg 05 05 I0

9 0deg 05 05 J0

0 90deg 07 007 A90

1 90deg 07 007 B90

9 90deg 07 007 E90

0 90deg 05 05 F90

1 90deg 05 05 G90

9 90deg 05 05 J90

Table 42 MOSFET array configurations for test pad modulesPTP1PTP2

43 test structure implementation 55

2μm 1μm

Metal2Metal1PolyActive

FET

DUMMY

TAP

TSV

Figure 411 MOSFET array detail

432 Digital selection logic

Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement

The procedure for selecting a single MOSFET device for meas-urement was as follows

bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled

bull MOSFET Gate terminal (G0 to G15) was connected to IO

pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates

bull MOSFET Drain terminal (D0 to D15) was connected to IO

pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder

43 test structure implementation 56

D0

D15

DF[0-3]

DS[0-3]

Drain-select decoder (2-bit)

SSF

SS

Array-select decoder (4-bit)

VG

Gate-select decoder (4-bit)

G0 G15

MOSFET Array

Figure 412 Digital selection logic

and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit

Drain-select decoder was adequate for accessing all 16Drain terminals in the array

When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)

A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating

43 test structure implementation 57

Gatedecoder

Metal2Metal1PolyActive

Transmissiongates (DS)

Transmission gates (G)

MOSFETArray

Figure 413 MOSFET array with digital selection logic

43 test structure implementation 58

BLK

A[0] A[1] A[2] A[3]

DEC[0-15]

Figure 414 4-bit Gate-enableArray-enable decoder

43 test structure implementation 59

DEC

VGGVDD

Transmission gate

Pull-downtransistor

Figure 415 Transmission gate for MOSFET Gate terminals

Transmission gates

FORCE

SENSE

DRAINSOURCE

EN

VDD

Figure 416 Transmission gates for MOSFET DrainSource terminals

43 test structure implementation 60

A[0] A[1]

DEC[0-3]

Figure 417 2-bit Drain-select decoder

44 experimental measurements 61

Configuration Type Name Wafer

1 No-TSVPMOS

(05microm times 05microm)F0 D08

2 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D08

3 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D20

4 1-TSV 0ordm 60 ordmCPMOS

(05microm times 05microm)G0 D08

54-lateral TSV 0ordm 25

ordmCPMOS

(05microm times 05microm)I0 D08

6 1-TSV 90ordm 25 ordmCPMOS

(05microm times 05microm)G90 D08

7 1-TSV 0ordm 25 ordmCPMOS (07microm times

007microm)B0 D08

Table 43 MOSFET array configurations selected for experimental eval-uation

44 experimental measurements

Test modules PTP1 and PTP2 implementing the presented MOS-

FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43

441 Data processing methodology

For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress

Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing

44 experimental measurements 62

ABCDEFBBC

FCDDFBCFFC

FCB

A

BAACADE

CECBBC

CFFCDE

B$BCFFBDBCBF

amp(BFC)B

BC

FCDDFA

CAF

Figure 418 Processing of measurement results

the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (

radic98) The described procedure for

processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not

placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)

442 Measurement setup

For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO

signal generation and measurement was provided by standard

44 experimental measurements 63

ABCDECDBF

ABECBF

AB

AB

Figure 419 Interpolation of measurement results

laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420

The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44

ABCCDE

F

AA

A

DBCB

Figure 420 Measurement setup

44 experimental measurements 64

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

DEC[0-9] InputDigital Output

(GATEDRAINARRAY SelectDecoders Input)

DF[0-3] Input SMU (Drain Force terminal)

DS[0-3] Output SMU (Drain Sense terminal)

SF Input SMU (Source Force terminal)

SS Output SMU (Source Sense terminal)

VG Input Digital Output (Gate voltage)

Table 44 PTP1PTP2 test pad module instrument connections

443 PMOS (long channel)

No-TSV configuration

This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance

As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure

1 TSV configuration

In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo

44 experimental measurements 65

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

096

097

098

099

100

101

Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)

and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements

To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm

The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph

44 experimental measurements 66

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

105

110

115

120TSV

Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)

44 experimental measurements 67

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

13

TSV

Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)

we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV

even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after

increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

09

1

11

12

13

14

Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)

44 experimental measurements 68

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115TSV

Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)

at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)

4 lateral TSV configuration

In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430

respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)

90˚ rotation TSV configuration

In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has

44 experimental measurements 69

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

085

090

095

100

105

110

115

TSV

TSV TSV

TSV

Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)

44 experimental measurements 70

2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309

095

1

105

11

115

12

X16

Distance (μm)

Normalized

Ion

TSV TSV

Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)

3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098

1102

Y14

Distance (μm)

Normalized

Ion

TSV TSV

Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)

44 experimental measurements 71

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115

120TSV

Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)

been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm

444 PMOS (short channel)

In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

08509

0951

10511

11512

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)

44 experimental measurements 72

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

094

096

098

100

102

104

106

108

110

TSV

Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)

measurements for X = 16 and Y = 14 in Figure 434 At at 1microm

distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)

45 summary and conclusions 73

Size

Waf

er

Con

figu

rati

on

Rot

atio

n

Eff

ect

1micro

mla

t

Eff

ect

1micro

mlo

ng

KO

Z

L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm

L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm

L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm

L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm

L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm

S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm

Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices

45 summary and conclusions

In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET

device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors

The structure was implemented in Imecrsquos experimental 65nm

3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations

In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits

Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short

45 summary and conclusions 74

channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress

Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure

1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex

2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy

3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away

4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic

The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated

In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs

5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S

51 introduction

TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2

DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs

increases linearly with the number of tiers and interconnections(Figure 51)

The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm

node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)

In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]

52 energy-recovery tsv interconnects

As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-

75

52 energy-recovery tsv interconnects 76

1XTSV

Tier1

TSV

Tier0LOGIC 0

LOGIC 0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n Tier1

Tier0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n

Tier(m-1)

Tier(m)

mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-

tion density

recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]

EER prop

(

RCL

T

)

CLV2DD (51)

In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]

ECMOS =12

CLV2DD (52)

Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits

EER

ECMOSprop

2RCL

T(53)

From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small

In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC

52 energy-recovery tsv interconnects 77

Tier2Tier1TSV

LOGIC 1

LOGIC 2

driver

TSV

LOGIC 1

LOGIC 2

CMOS

driver

Convert

Energy-recovery

PCG

Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery

can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV

interconnectsThe differences between a typical CMOS driver in a TSV-based

3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form

Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values

52 energy-recovery tsv interconnects 78

IN IN

ININ

OUT OUT

IN

IN

OUT

OUT

Figure 53 Adiabatic driver

Figure 54 A simple pulse-to-level converter (P2LC)

An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded

The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)

For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with

53 analysis 79

VDD2

Rind

RTG

AdiabaticDriver

CL

M1

T=1f

L ind

RTG

CL

Figure 55 Resonant pulse generator with dual-rail data encoding

Tier2

Tier1

f-fPULSE

VDD2CLK IN

Delay1CLK1

CLK0

Delay2CLK2OUT1

Data In

Data Out

CLKRES

TSV TSV TSV

CLKBufferAdiabatic

Driver

P2LC

f-f

Data1

M1

Figure 56 Energy-recovery scheme for TSV interconnects

a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at

f =1

2πradic

LindCL(54)

A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56

53 analysis

The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy

53 analysis 80

losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme

The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be

EER = EAD + Eind + EM1 + EP2L (55)

Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs

architecture used and could be easily extracted from simulationdata as will be shown later in this chapter

531 Adiabatic driver

There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS

dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances

The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle

ETG = 2ξ

(

RTGCL12 T

)

CLV2DD =

π2

2(RTGCL f )CLV2

DD (56)

The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient

53 analysis 81

Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV

capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be

CL = CTSV + Cn + 6CD (57)

The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience

κTG = RTGCn rArr RTG =κTG

Cn(58)

Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2

DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver

EAD = CnV2DD +

π2

2κTG

Cnf [CTSV + Cn + 6CD]

2 V2DD (59)

The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2

2 κTG f to a variable (y = π2

2 κTG f ) Equation 59then becomes

EAD = D middot CnV2DD +

y

Cn[CTSV + (6b + 1)Cn]

2 V2DD

=

[

Cn

(

D

y+ (6b + 1)2

)

+1

CnC2

TSV

]

middot yV2DD

+(12b + 2)CTSV middot yV2DD (510)

Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when

53 analysis 82

they become equal Solving equation Cn

(

Dy + (6b + 1)2

)

= 1Cn

C2TSV

for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV

Cn(opt) =

radic

[D

y+ (6b + 1)2]minus1CTSV (511)

Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver

EAD(min) =

[radic

D

(6b + 1)2y+ 1 + 1

]

middot (12b+ 2)yCTSVV2DD (512)

532 Inductorrsquos parasitic resistance

The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as

Qind =1

Rind

radic

Lind

CLrArr Rind =

1QindCL2π f

(513)

Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513

Eind =π2

2(RindCL f )CLV2

DD =π

4CL

QindV2

DD (514)

533 Switch M1

Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1

Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as

EM1(min) = 2IM1(rms)VGM1

radic

mκM1

f(515)

54 simulations 83

Delay2

CLK1

CLK2Delay1

CLKW

PULSE

CLK0T

PD

OUT1

PULSE

P2L

RES

CLK

Figure 57 Energy-recovery timing constraints

534 Timing Constraints

The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57

The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)

54 simulations

To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the

54 simulations 84

M1IN IN

IN

OUT OUT

IN

Figure 58 Test case for the evaluation of the theoretical model

Q250MHz 500MHz 750MHz

T S e () T S e () T S e ()

6 360 409 -120 476 516 -78 578 613 -57

8 308 347 -114 417 449 -70 515 543 -52

10 276 312 -115 382 411 -70 477 498 -43

12 255 289 -119 358 386 -71 451 472 -44

Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)

adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58

The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51

The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations

In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency

54 simulations 85

5 6 7 8 9 10 11 12 130

10

20

30

40

50

60

70

T250 S250 T500 S500 T750 S750

Q factor

Energydissipation(fF)

Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)

5 6 7 8 9 10 11 12 13

-14

-12

-10

-8

-6

-4

-2

0

250MHz 500MHz 750MHz

Q factor

Error()

Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator

The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation

54 simulations 86

M1

TSV

TSV

TSV

Data in OutCMOS

OutER

Energy-recovery

CMOS

P2LC

Adiabaticdriver

Figure 511 Test case for comparison of energy-recovery to CMOS

541 Comparison to CMOS

When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F

for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is

described by Equation 52 Appending the data switching activity(D)

ECMOS = D middot 12

CLV2DD (516)

Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS

circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison

Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range

54 simulations 87

Figure 512 Improved P2LC design

0 100 200 300 400 500 600 700 800 9000

2

4

6

8

10

12

14

Frequency (MHz)

Energy

dissipation(fJ)

Figure 513 Simulated energy dissipation of improved P2LC design

Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current

The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor

54 simulations 88

4 6 8 10 12 14 16 18 20 22-20

-10

0

10

20

30

40

50

60

800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor

Energyred uction()

Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)

4 6 8 10 12 14 16 18 20 220050

100150200250300350400450

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)

values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors

In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies

In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q

factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total

In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS

circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in

54 simulations 89

4 6 8 10 12 14 16 18 20 2200

100

200

300

400

500

600

700

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)

20 40 60 80 100 120 140 160 180 200 220-100

00

100

200

300

400

500

Q6 Q10 Q14 Q18

TSV capacitance (fF)

Energy

red u

ction(

)

Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)

Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value

Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz

Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm

node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS

54 simulations 90

4 6 8 10 12 14 16 18 20 22-40

-20

0

20

40

60

80

D=05 D=07 D=09 D=1

Q factor

Energyred uction()

Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)

4 6 8 10 12 14 16 18 20 220

10

20

30

40

50

60

130nm 65nm

Q factor

Energyred uction()

Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)

55 summary and conclusions 91

55 summary and conclusions

The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process

The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement

Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme

The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process

6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R

61 introduction

In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance

Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements

62 physical implementation

The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5

The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61

The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz

were calculated using the methodology developed in Chapter 5

and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery

circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)

92

62 physical implementation 93

AB

AB

CDEFD

C

ABCD

EFD

C

Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)

Adiabatic driver 80-bit circuit

Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )

RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)

CL 207 f F (Eq57) Lind 613nH (Eq54)

Table 61 Energy-recovery IO circuit parameters

62 physical implementation 94

IntegratedInductor

VINDOER

PulseGenerator

S1AS2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

Figure 62 Energy-recovery IO circuit schematic diagram

Metal2Metal1PolyActiveVDD

GND

RES_CLK

DATA

OUT

OUT_B

DATA

OUT_B OUT

RES_CLK

GND

VDD

Figure 63 Adiabatic driver (layoutschematic view)

the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs

621 Adiabatic driver

The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63

62 physical implementation 95

Metal2Metal1PolyActive

IN

OUT_B

VDD

IN_B

OUT

GND

OUT_B

IN

GND

VDD

OUT

IN_B

Figure 64 P2LC (layoutschematic view)

622 Pulse-to-level converter

The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals

623 Data generator

A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1

As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE

624 Pulse generator

The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the

62 physical implementation 96

Metal2Metal1PolyActive EN

CLK

VDD

Q

GND

D

CLKGND

ENVDD

QF-F

D

GND

VDDCLK

D Q

Figure 65 Data generator (layoutschematic view)

Delay1

WPULSE

PULSE

CLKTCLK

DATA

CLK_RES

Figure 66 Timing constraints for the energy-recovery demonstratorcircuit

62 physical implementation 97

PULSES1A

S2

GND

VDD

Metal2Metal1PolyActive

VDD

S1

S2

GND

PULSE

Figure 67 Pulse generator (layoutschematic view)

S2ΔPhase

WPULSE

PULSE

S1A

Figure 68 Pulse generator InputOutput signals

gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control

For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68

625 Integrated inductor

The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-

1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)

62 physical implementation 98

100 200 300 400 500 600 700 8005

7

9

11

13

15

17

19

21

10

15

20

25

30

35

40

45

Q (Simulated) Energy Reduction ()

Frequency (MHz)

Qfactor

Energyreduction()

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS

lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)

As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited

For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610

626 Top level design

The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)

For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously

62 physical implementation 99

500μ

m

500μm

Figure 610 Planar square spiral inductor designed in ASITIC

shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance

The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG

80 = 16880 = 21Ω However as

the current flows to lower branch levels in the tree the perceivedresistance increases to RTG

16 RTG4 and on the last one to RTG Any

additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512

Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20

4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit

To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of

62 physical implementation 100

Adiabatic drivers

RTG

16

RTG

80

RTG

4

RTG

Level 1Level 2Level 3Level 4

Figure 611 Resonant pulse distribution tree

Branch level Wire resistance

1st 5 middot 16880 = 105mΩ

2nd 5 middot 16816 = 525mΩ

3rd 5 middot 1684 = 21Ω

4th 5 middot 168Ω = 84Ω

Total load 168+8480 + 21

20 + 05255 + 0105 = 252Ω

Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree

63 post-layout simulation and comparison 101

+-

+-Metal2 Metal1

180μ

m

100μm

Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)

the energy-recovery IO circuit in layout view is illustrated inFigure 613

627 CMOS IO circuit

The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance

The top level layout view of the CMOS IO circuit can be seenin Figure 616

63 post-layout simulation and comparison

The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed

63 post-layout simulation and comparison 102

22mm

12mm

GND GNDVBUF1 VDR

GND

VIND

S1A

S2

EN

OER

Figure 613 Energy-recovery IO circuit in layout view

63 post-layout simulation and comparison 103

OCMOS

S1B DataGenerator

TSV

CMOSDriver

CMOSDriver

TSV Buffer

1

80

EN

Figure 614 CMOS IO circuit schematic diagram

GND

VCMOS

DATA

078u

039u

105u

053u

316u

158u

950u

475u

OUT

Figure 615 CMOS driver schematic

12mm

12mm

OCMOS S1B GND VBUF2 VCMOSGND

GND GNDEN VCMOS

Figure 616 CMOS IO circuit in layout view

63 post-layout simulation and comparison 104

Energy dissipation (80-bits cycle)

Inductor 11pJ (Eq514)

Switch M1 092pJ [ ]

P2LCs 098pJ (Simulated)

Adiabatic drivers 315pJ (Eq512)

Total 615pJ

Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz

After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency

Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme

As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the

63 post-layout simulation and comparison 105

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14RES_CLK

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14DATA

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14OER

Time (ns)

V(V

)

Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit

64 experimental measurements 106

AreaEnergy

dissipation(Theory)

Energydissipation

(Post-layout)

CMOS 144mm2922pJ

(115 f Jbit)

(Eq52)

1265pJ

(158 f Jbit)

Energy recovery 264mm2 615pJ

(77 f Jbit)

768pJ

(96 f Jbit)

Difference +83 minus333 minus393

Table 64 Post-layout simulated energy dissipation Area require-ments

theoretically calculated 333 up to 393 in the post-layoutsimulations

64 experimental measurements

The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively

641 Measurement setup

For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620

The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66

642 Results

CMOS IO circuit

In Figure 621 the input signal S1B and the output signal OCMOS

are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency

64 experimental measurements 107

Figure 618 Energy-recovery IO circuit micrograph

Figure 619 CMOS IO circuit micrograph

AABC

DEFF

D

EEBA

EBBA

Figure 620 Measurement setup

64 experimental measurements 108

Test pad Direction Instrument connection

VIND Power Power supply - gt06V

GND Power Power supply - 0V

S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)

EN InputDigital Output (enable outputdata switching)

OER Output Oscilloscope

VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V

VBUF1 Power Power supply (Buffers) - 12V

Table 65 Energy-recovery circuit instrument connections

Test pad Direction Instrument connection

VCMOS PowerPower supply (CMOS drivers) -12V

GND Power Power supply - 0V

VBUF2 Power Power supply (Buffers) - 12V

S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

OCMOS Output Oscilloscope

EN InputDigital Output (enable outputdata switching)

Table 66 CMOS circuit instrument connections

64 experimental measurements 109

S1B

OCMOS

Figure 621 Oscilloscope output for signals S1B and OCMOS

100 200 300 400 500 600 700 800012345678910

Die1 Die2 Simulation

Frequency (MHz)

Powerdissipation(mW)

Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit

that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected

The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations

Energy-recovery IO circuit

Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER

64 experimental measurements 110

100 200 300 400 500 600 700 800024681012141618

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 623 Measuredsimulated energy dissipation for the CMOScircuit

S1A

OER

Figure 624 Oscilloscope output for signals S1A and OER

shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected

The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)

Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate

65 summary and conclusions 111

200 250 300 350 400 450 500 550 600 65002468101214161820

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit

IntegratedInductor

VINDOER

PulseGenerator

S1A

S2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

200pF

Figure 626 Energy-recovery IO circuit with fault hypothesis

properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER

switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation

65 summary and conclusions

In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them

The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-

65 summary and conclusions 112

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050

055

060

065

070

075

080RES_CLK

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500

02

04

06

08

10

12

14OER

Time (ns)

V(V

)

Figure 627 Simulation of circuit with fault hypothesis

65 summary and conclusions 113

200 250 300 350 400 450 500 550 600 6500

5

10

15

20

25

Die1 Die2 Simulation CL=200pF

Frequency (MHz)

Ene

rgy

diss

ipat

ion

(pJ)

Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis

ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS

circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated

CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

7C O N C L U S I O N S

Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density

As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV

devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs

Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-

FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process

Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process

The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work

114

71 main contributions 115

71 main contributions

From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data

The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations

In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design

72 future work 116

approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances

The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

72 future work

The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV

capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]

In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes

1 The transistors should be in a regular grid in the array forsimple processing of the measurement data

72 future work 117

2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy

3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV

4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues

Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility

In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory

AA P P E N D I X A P H O T O S

Figure A1 3D65 300mm wafer (on chuck)

Figure A2 Probe card

118

appendix a photos 119

Figure A3 3D130 200mm wafer (on chuck)

ProbeHead

Microscope

Figure A4 Probe head

BA P P E N D I X B S TAT I S T I C S

mean (micro)

In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as

micro =1n

n

sumi=1

xi

median (m)

The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean

standard deviation (σ)

Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation

A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability

standard deviation of the mean (σmicro )

The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated

Assuming there is a data set X with n values xi of mean micro

120

appendix b statistics 121

variance(X) = σ2X

variance(micro) = variance( 1n sum

ni=1 xi) =

1n2 variance(sumn

i=1 xi) =1n2 sum

ni=1 variance(xi) =

nn2 variance(X) =1n variance(X) =

σ2Xn rarr

σmicro = σXradicn

coefficient of variation

The coefficient of variation is the ratio of the standard deviationσ to the mean micro

CV =σ

micro

It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage

outlier

Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set

kruskalndash wallis [34]

KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal

B I B L I O G R A P H Y

[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac

currentsourcebroadpurposemn=2602

[2] Max-3d URL httpwwwmicromagiccom

[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet

[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet

[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001

[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI

Design Conf pages 171ndash174 2005 doi 101109ICVD200564

[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf

3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581

[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009

[9] C H Bennett Logical reversibility of computation IBM

Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525

[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038

scientificamerican0785-48

[11] E Beyne 3d system integration technologies In Proc Int

VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113

122

bibliography 123

[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs

22nm-Details_Presentationpdf

[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327

[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534

[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices

Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124

[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942

[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc

IEEE Int Interconnect Technology Conf and 2011 Materials for

Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326

[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf

(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823

[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS

rsquo04 volume 2 2004 doi 101109ISCAS20041329255

[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE

Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260

[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-

bibliography 124

asitics for 3-dimensional integrated circuits In Workshop

Notes Design Automation and Test in Europe (DATE) 2009

[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT

2008 pages 98ndash103 2008 doi 101109IDT20084802475

[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec

beScientificReportSR2009HTML1213307html

[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905

[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE

Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329

[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int

Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311

[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508

[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712

[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii

B9780750677295500446

[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455

bibliography 125

[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591

[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test

Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889

[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on

Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10

1145224081224115

[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical

Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779

[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom

pressroompdfkkuhnKuhn_IWCE_invited_textpdf

[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5

(3)183ndash191 1961 doi 101147rd530183

[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827

[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190

[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf

(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839

[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-

nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079

bibliography 126

[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System

level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm

org101145966747966750

[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453

[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf

PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725

[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE

Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793

[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629

[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp

Exhibition (DATE) pages 1689ndash1694 2010

[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573

[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190

[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi

bibliography 127

A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices

Meeting (IEDM) 2010 doi 101109IEDM20105703278

[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727

[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10

1109JPROC1998658762

[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test

Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103

[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc

56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676

[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443

[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88

(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom

sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on

[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303

bibliography 128

[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium

on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145

10576611057669

[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872

[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004

[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-

sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888

[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156

[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc

IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522

[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501

[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667

[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477

[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-

tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609

bibliography 129

[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology

Conf 2006 doi 101109ECTC20061645678

[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power

Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220

[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc

Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786

[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-

Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338

[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088

[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044

[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int

Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763

[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-

tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http

dxdoiorg101007BF01788562 101007BF01788562

bibliography 130

[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-

electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww

sciencedirectcomsciencearticleB6V44-4829XDJ-2N

2f5b2c221dcf240c38cd13b7b3432f913

[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182

[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541

[78] NHE Weste and DF Harris CMOS VLSI design a cir-

cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775

[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite

printsteve_wozniak_interview

[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-

tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716

[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect

Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710

[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746

[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764

[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915

bibliography 131

[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610

  • Abstract
  • Publications
  • Acknowledgments
  • Contents
  • List of Figures
  • List of Tables
  • Acronyms
  • 1 Introduction
    • 11 Research goals and contribution
    • 12 Thesis outline
      • 2 Background
        • 21 3D integrated circuits
          • 211 3D integration approaches
          • 212 TSV interconnects
            • 22 Energy-recovery logic
              • 221 Energy-recovery architectures
              • 222 Power-clock generators
                • 23 Full-custom design flow
                • 24 Summary
                  • 3 Characterization of TSV capacitance
                    • 31 Introduction
                    • 32 TSV parasitic capacitance
                    • 33 Integrated capacitance measurement techniques
                    • 34 Electrical characterization
                      • 341 Physical implementation
                      • 342 Measurement setup
                      • 343 Experimental results
                        • 35 Summary and conclusions
                          • 4 TSV proximity impact on MOSFET performance
                            • 41 Introduction
                            • 42 TSV-induced thermomechanical stress
                            • 43 Test structure implementation
                              • 431 MOSFET arrays
                              • 432 Digital selection logic
                                • 44 Experimental measurements
                                  • 441 Data processing methodology
                                  • 442 Measurement setup
                                  • 443 PMOS (long channel)
                                  • 444 PMOS (short channel)
                                    • 45 Summary and conclusions
                                      • 5 Evaluation of energy-recovery for TSV interconnects
                                        • 51 Introduction
                                        • 52 Energy-recovery TSV interconnects
                                        • 53 Analysis
                                          • 531 Adiabatic driver
                                          • 532 Inductors parasitic resistance
                                          • 533 Switch M1
                                          • 534 Timing Constraints
                                            • 54 Simulations
                                              • 541 Comparison to CMOS
                                                • 55 Summary and conclusions
                                                  • 6 Energy-recovery 3D-IC demonstrator
                                                    • 61 Introduction
                                                    • 62 Physical implementation
                                                      • 621 Adiabatic driver
                                                      • 622 Pulse-to-level converter
                                                      • 623 Data generator
                                                      • 624 Pulse generator
                                                      • 625 Integrated inductor
                                                      • 626 Top level design
                                                      • 627 CMOS IO circuit
                                                        • 63 Post-layout simulation and comparison
                                                        • 64 Experimental measurements
                                                          • 641 Measurement setup
                                                          • 642 Results
                                                            • 65 Summary and conclusions
                                                              • 7 Conclusions
                                                                • 71 Main contributions
                                                                • 72 Future work
                                                                  • A Appendix A Photos
                                                                  • B Appendix B Statistics
                                                                  • Bibliography
Page 2: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate

Contact panagiotisasimakopoulosn13la13uk

NCL-EECE-MSD-TR-2011-173

Copyright ccopy 2011 Newcastle University

Microelectronics System Design Research Group

School of Electrical Electronic amp Computer Engineering

Merz Court

Newcastle University

Newcastle upon Tyne NE1 7RU UKhttpasyn13orguk

O P T I M I Z I N G T H E I N T E G R AT I O N A N DE N E R G Y E F F I C I E N C Y O F T H R O U G H

S I L I C O N V I A - B A S E D 3 D I N T E R C O N N E C T S

panagiotis asimakopoulos

A Thesis Submitted for the Degree of Doctor of Philosophyat Newcastle University

School of Electrical Electronic and Computer Engineering

Newcastle upon Tyne

November 2011

Panagiotis Asimakopoulos Optimizing the integration and energy

efficiency of through silicon via-based 3D interconnects PhD Thesiscopy November 2011

A B S T R A C T

The aggressive scaling of CMOS process technology has been driv-ing the rapid growth of the semiconductor industry for more thanthree decades In recent years the performance gains enabledby CMOS scaling have been increasingly challenged by highly-parasitic on-chip interconnects as wire parasitics do not scale atthe same pace Emerging 3D integration technologies based onvertical through-silicon vias (TSVs) promise a solution to the inter-connect performance bottleneck along with reduced fabricationcost and heterogeneous integration

As TSVs are a relatively recent interconnect technology innov-ative test structures are required to evaluate and optimise theprocess as well as extract parameters for the generation of designrules and models From the circuit designerrsquos perspective criticalTSV characteristics are its parasitic capacitance and thermomech-anical stress distribution This work proposes new test structuresfor extracting these characteristics The structures were fabric-ated on a 65nm 3D process and used for the evaluation of thattechnology

Furthermore as TSVs are implemented in large densely inter-connected 3D-system-on-chips (SoCs) the TSV parasitic capacitancemay become an important source of energy dissipation Typicallow-power techniques based on voltage scaling can be usedthough this represents a technical challenge in modern techno-logy nodes In this work a novel TSV interconnection scheme isproposed based on reversible computing which shows frequency-dependent energy dissipation The scheme is analysed using the-oretical modelling while a demonstrator IC was designed basedon the developed theory and fabricated on a 130nm 3D process

iii

P U B L I C AT I O N S

Some ideas and figures have appeared previously in the followingpublications

1 D Perry J Cho S Domae P Asimakopoulos A YakovlevP Marchal G Van der Plas and N Minas An efficientarray structure to characterize the impact of through siliconvias on fet devices In Proc IEEE Int Microelectronic TestStructures (ICMTS) Conf pages 118ndash122 2011 (best paper

award)

2 A Mercha G Van der Plas V Moroz I De Wolf P Asi-

makopoulos N Minas S Domae D Perry M Choi ARedolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron DevicesMeeting (IEDM) 2010

3 A Mercha A Redolfi M Stucchi N Minas J Van OlmenS Thangaraju D Velenis S Domae Y Yang G Katti RLa- bie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D Perry G Vander Plas J H Cho P Marchal Y Travaly E Beyne S Biese-mans and B Swinnen Impact of thinning and throughsilicon via proximity on high-k metal gate first cmos per-formance In Proc Symp VLSI Technology (VLSIT) pages109ndash110 2010

4 P Asimakopoulos G Van der Plas A Yakovlev and PMarchal Evaluation of energy-recovering interconnects forlow-power 3d stacked ics In Proc IEEE Int Conf 3D Sys-tem Integration 3DIC 2009 pages 1ndash5 2009

5 P Asimakopoulos and A Yakovlev An AdiabaticPower-Supply Controller For Asynchronous Logic Circuits20th UK Asynchronous Forum pages 1ndash4 2008

iv

In the end I hope therersquos a little note somewhere

that says I designed a good computer

mdash Steve Wozniak [79]

A C K N O W L E D G E M E N T S

First and foremost I would like to express my gratitude to mysupervisor Prof Alex Yakovlev who provided the perfect balancebetween freedom and guidance allowing me to obtain sphericalknowledge and scope in microelectronics but at the same time topromptly complete my studies Irsquom also grateful to the Engineer-ing and Physical Science Research Council (EPSRC) for fundingmy PhD studies through their Doctoral Training Accounts

As part of my research project I had the opportunity to spendsome time at Imec Belgium The experiences and lessons learnedduring my stay at Imec will undoubtedly prove valuable in myforthcoming professional career I would like to thank Dan Perryfrom Qualcomm for sharing his experience and ideas for the de-sign of the characterization structures implemented on the 65nm

Imec process Special thanks to Shinichi Domae from Panasonicand Dr Dimitrios Velenis for their assistance with the 65nm wafermeasurements as well as Marco Facchini for the time he spendtrying to measure the energy-recovery circuits Last but not leastIrsquod like to thank Geert Van der Plas and Paul Marchal for theiradvices and guidance throughout my stay at Imec

Finally many thanks to my colleagues at Newcastle Universityfor assisting me in diverse ways during my studies I would liketo especially mention Dr Santosh Shedabale and James Dochertyfor reviewing and commenting on parts of this text as well asDr Robin Emery for the numerous technical discussions and forhelping setup the design tools

v

C O N T E N T S

1 Introduction 1

11 Research goals and contribution 2

12 Thesis outline 3

2 Background 5

21 3D integrated circuits 5

211 3D integration approaches 6

212 TSV interconnects 9

22 Energy-recovery logic 13

221 Energy-recovery architectures 17

222 Power-clock generators 18

23 Full-custom design flow 20

24 Summary 21

3 Characterization of TSV capacitance 23

31 Introduction 23

32 TSV parasitic capacitance 23

33 Integrated capacitance measurement techniques 24

34 Electrical characterization 27

341 Physical implementation 28

342 Measurement setup 36

343 Experimental results 37

35 Summary and conclusions 44

4 TSV proximity impact on MOSFET performance 46

41 Introduction 46

42 TSV-induced thermomechanical stress 46

43 Test structure implementation 48

431 MOSFET arrays 51

432 Digital selection logic 55

44 Experimental measurements 61

441 Data processing methodology 61

442 Measurement setup 62

443 PMOS (long channel) 64

444 PMOS (short channel) 71

45 Summary and conclusions 73

5 Evaluation of energy-recovery for TSV interconnects 75

51 Introduction 75

52 Energy-recovery TSV interconnects 75

53 Analysis 79

531 Adiabatic driver 80

532 Inductorrsquos parasitic resistance 82

533 Switch M1 82

534 Timing Constraints 83

54 Simulations 83

541 Comparison to CMOS 86

vi

contents vii

55 Summary and conclusions 91

6 Energy-recovery 3D-IC demonstrator 92

61 Introduction 92

62 Physical implementation 92

621 Adiabatic driver 94

622 Pulse-to-level converter 95

623 Data generator 95

624 Pulse generator 95

625 Integrated inductor 97

626 Top level design 98

627 CMOS IO circuit 101

63 Post-layout simulation and comparison 101

64 Experimental measurements 106

641 Measurement setup 106

642 Results 106

65 Summary and conclusions 111

7 Conclusions 114

71 Main contributions 115

72 Future work 116

a Appendix A Photos 118

b Appendix B Statistics 120

bibliography 122

L I S T O F F I G U R E S

Figure 11 Scaling trend of logic and interconnect delay[30] 2

Figure 21 Comparison of 2D3D Integration 5

Figure 22 System-in-a-package 6

Figure 23 Package-on-package 6

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7

Figure 25 3D wafer-level-packaging 7

Figure 26 3D-stacked IC 8

Figure 27 TSV process steps 9

Figure 28 Before thinningafter thinning 10

Figure 29 TSV stacking 10

Figure 210 Conventional AND gate 14

Figure 211 CMOS switching 16

Figure 212 Adiabatic switching 16

Figure 213 Basic 2N-2P adiabatic circuit 17

Figure 214 2N-2P timing 17

Figure 215 Adiabatic driver 18

Figure 216 Stepwise charging generator 19

Figure 217 1N power clock generator 20

Figure 218 Design flow 22

Figure 219 Inverter in Virtuoso schematic 22

Figure 220 Inverter in Virtuoso layout 22

Figure 31 Isolated 2D wire capacitance 24

Figure 32 TSV parasitic capacitance 25

Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25

Figure 34 CBCM pseudo-inverters 27

Figure 35 Current as a function of frequency in CBCM 28

Figure 36 CBCM pseudo-inverter pair in layout 29

Figure 37 PRC1 test pad module (layout) 30

Figure 38 PRC1 test pad module (micrograph) 30

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34

Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35

viii

List of Figures ix

Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35

Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36

Figure 316 Measurement setup 36

Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39

Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40

Figure 319 TSV capacitance measurement results onwafer D11 40

Figure 320 TSV capacitance measurement results onwafer D18 41

Figure 321 Simulated oxide liner thickness variation 42

Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42

Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44

Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44

Figure 41 TSV thermomechanical stress simulation [40] 47

Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48

Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49

Figure 44 PTP1 test pad module (layout) 50

Figure 45 PTP1 test pad module (micrograph) 50

Figure 46 MOSFET dimensions and rotations 51

Figure 47 1 TSV MOSFET array configuration 52

Figure 48 4 diagonal TSVs MOSFET array configuration52

Figure 49 4 lateral TSVs MOSFET array configuration 53

Figure 410 9 TSVs MOSFET array configuration 53

Figure 411 MOSFET array detail 55

Figure 412 Digital selection logic 56

Figure 413 MOSFET array with digital selection logic 57

Figure 414 4-bit Gate-enableArray-enable decoder 58

Figure 415 Transmission gate for MOSFET Gate terminals59

Figure 416 Transmission gates for MOSFET Drain-Source terminals 59

Figure 417 2-bit Drain-select decoder 60

Figure 418 Processing of measurement results 62

Figure 419 Interpolation of measurement results 63

Figure 420 Measurement setup 63

Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65

List of Figures x

Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66

Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66

Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67

Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67

Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68

Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69

Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69

Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70

Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70

Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71

Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71

Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72

Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72

Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76

Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77

Figure 53 Adiabatic driver 78

Figure 54 A simple pulse-to-level converter (P2LC) 78

Figure 55 Resonant pulse generator with dual-rail dataencoding 79

Figure 56 Energy-recovery scheme for TSV interconnects79

Figure 57 Energy-recovery timing constraints 83

Figure 58 Test case for the evaluation of the theoret-ical model 84

Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85

Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85

Figure 511 Test case for comparison of energy-recoveryto CMOS 86

Figure 512 Improved P2LC design 87

List of Figures xi

Figure 513 Simulated energy dissipation of improvedP2LC design 87

Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88

Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88

Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89

Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =

80 f F f = 500MHz) 89

Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90

Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90

Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93

Figure 62 Energy-recovery IO circuit schematic dia-gram 94

Figure 63 Adiabatic driver (layoutschematic view) 94

Figure 64 P2LC (layoutschematic view) 95

Figure 65 Data generator (layoutschematic view) 96

Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96

Figure 67 Pulse generator (layoutschematic view) 97

Figure 68 Pulse generator InputOutput signals 97

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98

Figure 610 Planar square spiral inductor designed inASITIC 99

Figure 611 Resonant pulse distribution tree 100

Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101

Figure 613 Energy-recovery IO circuit in layout view 102

Figure 614 CMOS IO circuit schematic diagram 103

Figure 615 CMOS driver schematic 103

Figure 616 CMOS IO circuit in layout view 103

Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105

Figure 618 Energy-recovery IO circuit micrograph 107

List of Figures xii

Figure 619 CMOS IO circuit micrograph 107

Figure 620 Measurement setup 107

Figure 621 Oscilloscope output for signals S1B and OC-MOS 109

Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109

Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110

Figure 624 Oscilloscope output for signals S1A and OER110

Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111

Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111

Figure 627 Simulation of circuit with fault hypothesis 112

Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113

Figure A1 3D65 300mm wafer (on chuck) 118

Figure A2 Probe card 118

Figure A3 3D130 200mm wafer (on chuck) 119

Figure A4 Probe head 119

L I S T O F TA B L E S

Table 21 3D interconnect technologies comparison 8

Table 22 Technology parameters for Imec 3D130 process11

Table 23 Technology parameters for Imec 3D65 process11

Table 24 Comparison energy-recoveryconventionalCMOS 16

Table 31 Test structures on the PRC1 module 31

Table 32 PRC1 test pad instrument connections 37

Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38

Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39

Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43

Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45

Table 41 Coefficients of thermal expansion 47

Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54

Table 43 MOSFET array configurations selected forexperimental evaluation 61

Table 44 PTP1PTP2 test pad module instrumentconnections 64

Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73

Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84

Table 61 Energy-recovery IO circuit parameters 93

Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100

Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104

Table 64 Post-layout simulated energy dissipation Area requirements 106

Table 65 Energy-recovery circuit instrument connec-tions 108

Table 66 CMOS circuit instrument connections 108

xiii

A C R O N Y M S

2D two-dimentional

2N-2P 2 nMOS2 pMOS adiabatic logic

3D-IC three-dimentional integrated circuit

3D-SIC 3D stacked-IC

3D-SIP 3D system-in-package

3D three-dimentional

3D-WLP 3D wafer-level-packaging

BCB benzocyclobutene

BEOL back-end-of-line

CAL clocked adiabatic logic

CBCM charge-based capacitance measurement

CMOS complementary metal-oxide-semiconductor

CMP chemical-mechanical planarization

CTE coefficient of thermal expansion

Cu copper

CVD chemical vapor deposition

D2D die-to-die

D2W die-to-wafer

DCVSL differential cascode voltage switch logic

DRC design rule check

DUT device under test

ECRL efficient charge recovery logic

EDA electronic design automation

FEM finite element method

FEOL front-end-of-line

GPIB general purpose interface bus

IC integrated circuit

xiv

acronyms xv

IO inputoutput

KOZ keep-out-zone

LC inductance-capacitance

LVS layout versus schematic

MEMS microelectromechanical systems

MIM metal-insulator-metal

MOSFET metal-oxide-semiconductor field-effect transistor

MOS metal-oxide-semiconductor

MPU microprocessor unit

MuGFET multiple gate field-effect transistor

NMOS n-channel MOSFET

P2LC pulse-to-level converter

PAL pass-transistor adiabatic logic

PCB printed circuit board

PCG power-clock generator

PMD pre-metal dielectric

PMOS p-channel MOSFET

PVT process voltage temperature

QSERL quasi-static energy recovery logic

RC resistance-capacitance

RDL re-distribution layer

RLC resistance-inductance-capacitance

RMS root mean square

SIP system-in-a-package

Si silicon

SMU source measurement unit

SoC system-on-chip

TaN tantalum nitride

TSV through-silicon via

W tungsten

acronyms xvi

W2W wafer-to-wafer

WLP wafer-level-packaging

1I N T R O D U C T I O N

Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965

chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]

As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects

In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)

The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and

1 International Technology Roadmap for Semiconductors

1

11 research goals and contribution 2

Figure 11 Scaling trend of logic and interconnect delay [30]

innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]

Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D

interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density

11 research goals and contribution

As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-

ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]

The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future

12 thesis outline 3

3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection

The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution

The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs

are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process

The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements

12 thesis outline

This thesis is organised into seven Chapters and two Appendixesas follows

2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)

12 thesis outline 4

Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described

Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods

Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process

Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design

Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements

Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work

Appendix A includes photos used as reference in various partsof this work

Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data

2B A C K G R O U N D

The work presented in this thesis contributes in the fields of 3D

integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters

21 3d integrated circuits

CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package

There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy

Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of

ABCDEFBA ABCDEFBA

Figure 21 Comparison of 2D3D Integration

5

21 3d integrated circuits 6

ABCDEFBC

E

Figure 22 System-in-a-package

ABCDEFBC

E

Figure 23 Package-on-package

a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities

The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor

211 3D integration approaches

3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure

bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages

bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)

bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion

The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)

3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps

21 3d integrated circuits 7

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]

A

BCDEAFE

FFDC

CFE

Figure 25 3D wafer-level-packaging

take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]

3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]

The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-

SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost

A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21

21 3d integrated circuits 8

ABCD

EFFDB

F

Figure 26 3D-stacked IC

3D interconnect

technologyAdvantages Disadvantages

3D system-in-package (3D-SIP)

Simple existingtechnology bestheterogeneous

integration

Performance(speed power) low

densitymanufacturing costin high-quantities

3D wafer-level-packaging (3D-

WLP)

Existingtechnology good

compromisebetween

performance-manufacturing

cost

Averageperformance and

density

3D stacked-IC (3D-SIC)

Performance(speed power)

high densitymanufacturing costin high-quantities

Manufacturing costin low-quantities

not ready formanufacturing

Table 21 3D interconnect technologies comparison

21 3d integrated circuits 9

ABBCDE

FBACACDA

ACCD F

Figure 27 TSV process steps

212 TSV interconnects

Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection

In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie

Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively

As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]

21 3d integrated circuits 10

AAAB

CAAAB

ABC

DEF

Figure 28 Before thinningafter thinning

ABC ABC

DE DE

F

DEE

B

CBA

Figure 29 TSV stacking

21 3d integrated circuits 11

Minimum devicechannel length (nm)

130

Supply voltage (V) 12

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 22

TSV resistance (mΩ) ~20

TSV capacitance( f F)

~40

Wafer diameter(mm)

200

Table 22 Technology paramet-ers for Imec 3D130

process

Minimum devicechannel length (nm)

70

Supply voltage (V) 10

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 40

TSV resistance (mΩ) ~40

TSV capacitance( f F)

~80

Wafer diameter(mm)

300

Table 23 Technology paramet-ers for Imec 3D65 pro-cess

Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing

Design tools

Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs

Thermomechanical stress

TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur

21 3d integrated circuits 12

The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance

Thermal analysis

The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink

Electrical characteristics

The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]

Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV

structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process

22 energy-recovery logic 13

Testing

Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems

22 energy-recovery logic

TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology

1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections

2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them

Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that

22 energy-recovery logic 14

Figure 210 Conventional AND gate

were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6

Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded

Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output

Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to

22 energy-recovery logic 15

perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]

In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]

The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =

CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]

E = QVDD = CV2DD (21)

From the total transferred energy only half ( 12 CV2

DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C

The adiabatic charging of an equivalent node capacitance C

is modelled in Figure 212 In contrast to the conventional CMOS

circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to

EAdiabatic prop

(

RC

T

)

CV2DD (22)

The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters

22 energy-recovery logic 16

A

B

CC

D

E

D

Figure 211 CMOS switching

AA

B

C

B

Figure 212 Adiabatic switching

Conventional CMOS Energy-recovery CMOS

2 states True False 3 states True False Off

Energy of information bitdissipated as heat on the

pull-down network

Energy of information bitrecovered by the

power-supply

High speed Lowmedium speed

High energy dissipation Very low energy dissipation

Area efficient Some area overhead

Table 24 Comparison energy-recoveryconventional CMOS

Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]

A comparison between energy-recovery and conventional CMOSis summarised in Table 24

22 energy-recovery logic 17

ABCABC

Figure 213 Basic 2N-2P adiabaticcircuit

ABC

ABC

DEF

EE

DEF

Figure 214 2N-2P timing

221 Energy-recovery architectures

Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency

A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P

circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern

The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel

22 energy-recovery logic 18

Figure 215 Adiabatic driver

MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network

The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]

The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle

Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]

222 Power-clock generators

Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and

22 energy-recovery logic 19

A

A

B

C

C

Figure 216 Stepwise charging generator

recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators

Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow

Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at

f =1

2πradic

LCLOAD

(23)

As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is

23 full-custom design flow 20

A

BCD

E

DFF

CD

Figure 217 1N power clock generator

partially recovered and stored in the inductor for later use RLC

oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip

23 full-custom design flow

All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal

In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available

The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case

1 Cadence Design Systems Inc (httpwwwcadencecom)

24 summary 21

the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step

Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics

24 summary

In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process

2 Mentor Graphics Inc (httpwwwmentorcom)

24 summary 22

ABCDEA

FCCAAE

ABCDEAEC

DE

E

F

CEA

EEAEA

EDE

CAC

E

EC

CAC

Figure 218 Design flow

Figure 219 Inverter in Virtuososchematic

Figure 220 Inverter in Virtuosolayout

3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E

31 introduction

In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology

In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]

From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system

Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated

32 tsv parasitic capacitance

A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric

1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments

23

33 integrated capacitance measurement techniques 24

ABC

Figure 31 Isolated 2D wire capacitance

material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is

C =εox

hwl (31)

In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material

In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)

The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)

33 integrated capacitance measurement techniques

Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV

33 integrated capacitance measurement techniques 25

ABCDE

F

FD

F

F

FD

Figure 32 TSV parasitic capacitance

ABCDE FACDE

ECDE

BCFACDE

FCCDED DBCDE

D

DAB

Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor

33 integrated capacitance measurement techniques 26

characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories

1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]

2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]

The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer

On-chip techniques can achieve better accuracy when the DUT

capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]

In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows

CDUT =Im minus Ire f

VDD middot f(32)

The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of

34 electrical characterization 27

AB

AC

DE

FE

DE

FE

AB

AC

B

Figure 34 CBCM pseudo-inverters

Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot

The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET

capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz

should be adequate to avoid the MOS behaviour of the TSV

34 electrical characterization

A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented

34 electrical characterization 28

AB

CADE

F

AB

E

CAD

B

Figure 35 Current as a function of frequency in CBCM

on the same chip that went beyond the core objectives of thiswork

The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed

341 Physical implementation

In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result

Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics

34 electrical characterization 29

Poly dummy fills

Metal2Metal1PolysiliconActive

1μm 5μm

Figure 36 CBCM pseudo-inverter pair in layout

Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38

The CBCM pseudo-inverters were implemented on either TOP

or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below

Structure rsquoArsquo Single TSV capacitance (TOP die)

The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to

34 electrical characterization 30

DTOP DBTM

B10 A10

BREF AREF

B1 A1

R1 R2

SET VDD

RESET GND

C15 C10

C50 C20

SET VDD

ERC SET

EREF EC Metal2 (TOP)Metal1 (TOP)

Metal1 (BTM)Metal2 (BTM)

Figure 37 PRC1 test pad module(layout)

Figure 38 PRC1 test pad module(micrograph)

34 electrical characterization 31

Structure A B

TSVs - 1 2x5 - 1 2x5

Inverterlocation

TOP TOP TOP BTM BTM BTM

TSVconnection

TOP TOP TOP BTM BTM BTM

Multi-TSVconnection

- - TOP - - BTM

Multi-TSVpitch (microm)

- - 15 - - 15

Pad name AREF A1 A10 BREF B1 B10

Structure C D

TSVs 2x2 2x2 2x2 2x2 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP TOP

TSVconnection

TOP TOP TOP TOP TOP TOP

Multi-TSVconnection

TOP TOP TOP TOP TOP BTM

Multi-TSVpitch (microm)

10 15 20 50 20 20

Pad name C10 C15 C20 C50 DTOP DBTM

Structure R E

TSVs 1 1 - 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP

TSVconnection

TOP TOP - TOP TOP

Multi-TSVconnection

- - - TOP BTM

Multi-TSVpitch (microm)

- - - 20 20

Pad name R1 R2 EREF EC ERC

Table 31 Test structures on the PRC1 module

34 electrical characterization 32

Pseudo-inverters

2x5 TSVs

TSV

Reference

Metal2 (TOP)Metal1 (TOP)

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)

the single TSV DUT the structure also contains 10 parallel TSVs

arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude

Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)

The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side

Structure rsquoCrsquo Variable pitch TSV capacitance

The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the

34 electrical characterization 33

2x5 TSVs

Reference

TSV

Pseudo-inverters

Metal2 (BTM)Metal1 (BTM)

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)

literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process

The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative

Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-

TOM die

The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking

34 electrical characterization 34

2x2 TSVs - 10μm

2x2 TSVs - 15μm

2x2 TSVs - 20μm

2x2 TSVs - 50μm

Metal2 (TOP)Metal1 (TOP)

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die

34 electrical characterization 35

Metal2 (TOP)Metal1 (TOP)

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Driver

Reference

Driver

Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)

Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy

Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability

Structure rsquoErsquo TSV capacitance effect on signal propagation delay

Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult

34 electrical characterization 36

ABC

D

ECBFC

ABC

Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)

ABC

DEF

CD

A

CA

A

Figure 316 Measurement setup

342 Measurement setup

The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316

The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-

2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup

34 electrical characterization 37

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

SET InputPulse generator

(Non-overlapping with RESET)

RESET InputPulse generator

(Non-overlapping with SET)

A[REF 1 10] InputSMU (Inverter source - 1V

supply)

B[REF 1 10] InputSMU (Inverter source - 1V

supply)

C[10 15 2050]

InputSMU (Inverter source - 1V

supply)

D[TOP BTM] InputSMU (Inverter source - 1V

supply)

R[1 2] InputSMU (Inverter source - 1V

supply)

E[REF C RC] Output Oscilloscope (high-impedance)

Table 32 PRC1 test pad instrument connections

sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32

343 Experimental results

There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods

When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy

34 electrical characterization 38

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 454

Mean capacitance 709 f F

Standard deviation (σ) 19 f F

Coefficient of variation (3σmicro) 80

Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo

The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability

Die to die (D2D) variability

The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)

On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid

On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691

34 electrical characterization 39

-60 -40 -20 00 20 40 60 80

σ-σ

706

Deviation from mean (fF)

0

005

01

015

02

025

Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 414

Mean capacitance 895 f F

Standard deviation (σ) 22 f F

Coefficient of variation (3σmicro) 74

Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo

of the experimental data within plusmnσ of the mean capacitancevalue

Correlation between TSV capacitance and oxide liner thickness vari-

ation

The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers

Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication

34 electrical characterization 40

σ-σ

691

Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70

0

002

004

006

008

01

012

014

016

018

02

Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function

075

080

085

090

095

100

105

110

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

075

080

085

090

095

100

105

Wafer area

Figure 319 TSV capacitance measurement results on wafer D11

34 electrical characterization 41

086088090092094096098100102104106108

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

086

088

090

092

094

096

098

100

102

104

106

108

Wafer area

Figure 320 TSV capacitance measurement results on wafer D18

tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo

Wafer to wafer (W2W) variability

Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters

34 electrical characterization 42

109

100

095Nor

mal

ized

oxi

de th

ickn

ess

-5-10 -15 -20 -25 -30

510

1520

2530

0Die Y

Die X

Figure 321 Simulated oxide liner thickness variation

60 65 70 75 80 85 90 95 1000

005

01

015

02

025D11 D18

plusmnσ

plusmnσ

Capacitance (fF)

Figure 322 TSV capacitance W2W variability - Probability density func-tions

34 electrical characterization 43

Structure rsquoCrsquo (Fig 311)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 424

Experiment C10 C50

Number of TSVs 4 4

Mean capacitance 4073 f F 4125 f F

Standard deviation (σ) 84 f F 74 f F

Population within σ 695 653

Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo

Variable pitch TSV capacitance

The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm

were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires

As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50

are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability

To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]

Measurement accuracy of CBCM

The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we

3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)

35 summary and conclusions 44

376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320

001

002

003

004

005

006C10 C50

Capacitance (fF)

plusmnσC10

plusmnσC50

695 653

Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function

-80 -60 -40 -20 00 20 40 60 80 1000

002

004

006

008

01

012

014

016

018σ-σ

688

Deviation from mean (fF)

Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function

observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs

connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance

35 summary and conclusions

In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement

35 summary and conclusions 45

Method LCR CBCM

Total dies 472 472

Processed dies (excluding outliers) 393 414

Mean capacitance 820 f F 895 f F

Standard deviation (σ) 23 f F 22 f F

Coefficient of variation (3σmicro) 86 74

Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo

results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]

The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration

The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV

parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM

In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process

4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E

41 introduction

The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV

filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance

Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models

In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]

42 tsv-induced thermomechanical stress

Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or

46

42 tsv-induced thermomechanical stress 47

Material CTE at 20 ˚C (10minus6K)

Si 3

Cu 17

W 45

Table 41 Coefficients of thermal expansion

Figure 41 TSV thermomechanical stress simulation [40]

polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate

Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction

It is well known that stress can affect carrier mobility in MOS-

FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance

Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type

43 test structure implementation 48

Figure 42 TSV thermomechanical stress interaction between two TSVs[40]

proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative

The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections

43 test structure implementation

TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible

43 test structure implementation 49

Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]

The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET

devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types

Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP

die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45

43 test structure implementation 50

A0

B0

C0

D0

E0

F0

G0

H0

I0

J0

A90

B90

E90

F90

G90

J90

DS3 DF3

DF2 DS0

DS2 DF0

DF1 DS1

SS SF

DEC6 VDD

DEC7 GND

DEC8 VG

DEC4 DEC9

DEC5 GND

DEC2 DEC3

DEC0 DEC1

Arraydecoder

Draindecoder

Figure 44 PTP1 test pad mod-ule (layout)

Figure 45 PTP1 test pad mod-ule (micrograph)

43 test structure implementation 51

Metal2Metal1PolyActive

TSV

MOSFETs

07μm007μm 0deg

07μm007μm 90deg

05μm05μm 90deg

05μm05μm 0deg

Figure 46 MOSFET dimensions and rotations

431 MOSFET arrays

Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)

The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs

lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44

The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)

Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution

43 test structure implementation 52

TSV

MOSFETs Substrate taps

Figure 47 1 TSV MOSFET array configuration

MOSFETs Substrate taps

TSV TSV

TSV TSV

Figure 48 4 diagonal TSVs MOSFET array configuration

43 test structure implementation 53

MOSFETs Substrate taps

TSV

TSV TSV

TSV

Figure 49 4 lateral TSVs MOSFET array configuration

MOSFETs Substrate taps

TSV TSV TSV

TSV TSV TSV

TSV TSV TSV

Figure 410 9 TSVs MOSFET array configuration

43 test structure implementation 54

Configuration RotationFET

width(microm)

FETlength(microm)

Name

0 0deg 07 007 A0

1 0deg 07 007 B0

4diagonal 0deg 07 007 C0

4lateral 0deg 07 007 D0

9 0deg 07 007 E0

0 0deg 05 05 F0

1 0deg 05 05 G0

4diagonal 0deg 05 05 H0

4lateral 0deg 05 05 I0

9 0deg 05 05 J0

0 90deg 07 007 A90

1 90deg 07 007 B90

9 90deg 07 007 E90

0 90deg 05 05 F90

1 90deg 05 05 G90

9 90deg 05 05 J90

Table 42 MOSFET array configurations for test pad modulesPTP1PTP2

43 test structure implementation 55

2μm 1μm

Metal2Metal1PolyActive

FET

DUMMY

TAP

TSV

Figure 411 MOSFET array detail

432 Digital selection logic

Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement

The procedure for selecting a single MOSFET device for meas-urement was as follows

bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled

bull MOSFET Gate terminal (G0 to G15) was connected to IO

pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates

bull MOSFET Drain terminal (D0 to D15) was connected to IO

pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder

43 test structure implementation 56

D0

D15

DF[0-3]

DS[0-3]

Drain-select decoder (2-bit)

SSF

SS

Array-select decoder (4-bit)

VG

Gate-select decoder (4-bit)

G0 G15

MOSFET Array

Figure 412 Digital selection logic

and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit

Drain-select decoder was adequate for accessing all 16Drain terminals in the array

When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)

A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating

43 test structure implementation 57

Gatedecoder

Metal2Metal1PolyActive

Transmissiongates (DS)

Transmission gates (G)

MOSFETArray

Figure 413 MOSFET array with digital selection logic

43 test structure implementation 58

BLK

A[0] A[1] A[2] A[3]

DEC[0-15]

Figure 414 4-bit Gate-enableArray-enable decoder

43 test structure implementation 59

DEC

VGGVDD

Transmission gate

Pull-downtransistor

Figure 415 Transmission gate for MOSFET Gate terminals

Transmission gates

FORCE

SENSE

DRAINSOURCE

EN

VDD

Figure 416 Transmission gates for MOSFET DrainSource terminals

43 test structure implementation 60

A[0] A[1]

DEC[0-3]

Figure 417 2-bit Drain-select decoder

44 experimental measurements 61

Configuration Type Name Wafer

1 No-TSVPMOS

(05microm times 05microm)F0 D08

2 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D08

3 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D20

4 1-TSV 0ordm 60 ordmCPMOS

(05microm times 05microm)G0 D08

54-lateral TSV 0ordm 25

ordmCPMOS

(05microm times 05microm)I0 D08

6 1-TSV 90ordm 25 ordmCPMOS

(05microm times 05microm)G90 D08

7 1-TSV 0ordm 25 ordmCPMOS (07microm times

007microm)B0 D08

Table 43 MOSFET array configurations selected for experimental eval-uation

44 experimental measurements

Test modules PTP1 and PTP2 implementing the presented MOS-

FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43

441 Data processing methodology

For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress

Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing

44 experimental measurements 62

ABCDEFBBC

FCDDFBCFFC

FCB

A

BAACADE

CECBBC

CFFCDE

B$BCFFBDBCBF

amp(BFC)B

BC

FCDDFA

CAF

Figure 418 Processing of measurement results

the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (

radic98) The described procedure for

processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not

placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)

442 Measurement setup

For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO

signal generation and measurement was provided by standard

44 experimental measurements 63

ABCDECDBF

ABECBF

AB

AB

Figure 419 Interpolation of measurement results

laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420

The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44

ABCCDE

F

AA

A

DBCB

Figure 420 Measurement setup

44 experimental measurements 64

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

DEC[0-9] InputDigital Output

(GATEDRAINARRAY SelectDecoders Input)

DF[0-3] Input SMU (Drain Force terminal)

DS[0-3] Output SMU (Drain Sense terminal)

SF Input SMU (Source Force terminal)

SS Output SMU (Source Sense terminal)

VG Input Digital Output (Gate voltage)

Table 44 PTP1PTP2 test pad module instrument connections

443 PMOS (long channel)

No-TSV configuration

This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance

As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure

1 TSV configuration

In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo

44 experimental measurements 65

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

096

097

098

099

100

101

Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)

and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements

To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm

The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph

44 experimental measurements 66

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

105

110

115

120TSV

Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)

44 experimental measurements 67

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

13

TSV

Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)

we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV

even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after

increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

09

1

11

12

13

14

Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)

44 experimental measurements 68

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115TSV

Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)

at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)

4 lateral TSV configuration

In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430

respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)

90˚ rotation TSV configuration

In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has

44 experimental measurements 69

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

085

090

095

100

105

110

115

TSV

TSV TSV

TSV

Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)

44 experimental measurements 70

2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309

095

1

105

11

115

12

X16

Distance (μm)

Normalized

Ion

TSV TSV

Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)

3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098

1102

Y14

Distance (μm)

Normalized

Ion

TSV TSV

Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)

44 experimental measurements 71

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115

120TSV

Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)

been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm

444 PMOS (short channel)

In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

08509

0951

10511

11512

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)

44 experimental measurements 72

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

094

096

098

100

102

104

106

108

110

TSV

Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)

measurements for X = 16 and Y = 14 in Figure 434 At at 1microm

distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)

45 summary and conclusions 73

Size

Waf

er

Con

figu

rati

on

Rot

atio

n

Eff

ect

1micro

mla

t

Eff

ect

1micro

mlo

ng

KO

Z

L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm

L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm

L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm

L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm

L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm

S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm

Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices

45 summary and conclusions

In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET

device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors

The structure was implemented in Imecrsquos experimental 65nm

3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations

In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits

Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short

45 summary and conclusions 74

channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress

Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure

1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex

2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy

3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away

4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic

The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated

In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs

5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S

51 introduction

TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2

DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs

increases linearly with the number of tiers and interconnections(Figure 51)

The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm

node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)

In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]

52 energy-recovery tsv interconnects

As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-

75

52 energy-recovery tsv interconnects 76

1XTSV

Tier1

TSV

Tier0LOGIC 0

LOGIC 0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n Tier1

Tier0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n

Tier(m-1)

Tier(m)

mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-

tion density

recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]

EER prop

(

RCL

T

)

CLV2DD (51)

In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]

ECMOS =12

CLV2DD (52)

Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits

EER

ECMOSprop

2RCL

T(53)

From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small

In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC

52 energy-recovery tsv interconnects 77

Tier2Tier1TSV

LOGIC 1

LOGIC 2

driver

TSV

LOGIC 1

LOGIC 2

CMOS

driver

Convert

Energy-recovery

PCG

Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery

can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV

interconnectsThe differences between a typical CMOS driver in a TSV-based

3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form

Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values

52 energy-recovery tsv interconnects 78

IN IN

ININ

OUT OUT

IN

IN

OUT

OUT

Figure 53 Adiabatic driver

Figure 54 A simple pulse-to-level converter (P2LC)

An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded

The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)

For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with

53 analysis 79

VDD2

Rind

RTG

AdiabaticDriver

CL

M1

T=1f

L ind

RTG

CL

Figure 55 Resonant pulse generator with dual-rail data encoding

Tier2

Tier1

f-fPULSE

VDD2CLK IN

Delay1CLK1

CLK0

Delay2CLK2OUT1

Data In

Data Out

CLKRES

TSV TSV TSV

CLKBufferAdiabatic

Driver

P2LC

f-f

Data1

M1

Figure 56 Energy-recovery scheme for TSV interconnects

a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at

f =1

2πradic

LindCL(54)

A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56

53 analysis

The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy

53 analysis 80

losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme

The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be

EER = EAD + Eind + EM1 + EP2L (55)

Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs

architecture used and could be easily extracted from simulationdata as will be shown later in this chapter

531 Adiabatic driver

There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS

dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances

The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle

ETG = 2ξ

(

RTGCL12 T

)

CLV2DD =

π2

2(RTGCL f )CLV2

DD (56)

The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient

53 analysis 81

Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV

capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be

CL = CTSV + Cn + 6CD (57)

The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience

κTG = RTGCn rArr RTG =κTG

Cn(58)

Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2

DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver

EAD = CnV2DD +

π2

2κTG

Cnf [CTSV + Cn + 6CD]

2 V2DD (59)

The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2

2 κTG f to a variable (y = π2

2 κTG f ) Equation 59then becomes

EAD = D middot CnV2DD +

y

Cn[CTSV + (6b + 1)Cn]

2 V2DD

=

[

Cn

(

D

y+ (6b + 1)2

)

+1

CnC2

TSV

]

middot yV2DD

+(12b + 2)CTSV middot yV2DD (510)

Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when

53 analysis 82

they become equal Solving equation Cn

(

Dy + (6b + 1)2

)

= 1Cn

C2TSV

for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV

Cn(opt) =

radic

[D

y+ (6b + 1)2]minus1CTSV (511)

Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver

EAD(min) =

[radic

D

(6b + 1)2y+ 1 + 1

]

middot (12b+ 2)yCTSVV2DD (512)

532 Inductorrsquos parasitic resistance

The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as

Qind =1

Rind

radic

Lind

CLrArr Rind =

1QindCL2π f

(513)

Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513

Eind =π2

2(RindCL f )CLV2

DD =π

4CL

QindV2

DD (514)

533 Switch M1

Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1

Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as

EM1(min) = 2IM1(rms)VGM1

radic

mκM1

f(515)

54 simulations 83

Delay2

CLK1

CLK2Delay1

CLKW

PULSE

CLK0T

PD

OUT1

PULSE

P2L

RES

CLK

Figure 57 Energy-recovery timing constraints

534 Timing Constraints

The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57

The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)

54 simulations

To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the

54 simulations 84

M1IN IN

IN

OUT OUT

IN

Figure 58 Test case for the evaluation of the theoretical model

Q250MHz 500MHz 750MHz

T S e () T S e () T S e ()

6 360 409 -120 476 516 -78 578 613 -57

8 308 347 -114 417 449 -70 515 543 -52

10 276 312 -115 382 411 -70 477 498 -43

12 255 289 -119 358 386 -71 451 472 -44

Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)

adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58

The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51

The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations

In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency

54 simulations 85

5 6 7 8 9 10 11 12 130

10

20

30

40

50

60

70

T250 S250 T500 S500 T750 S750

Q factor

Energydissipation(fF)

Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)

5 6 7 8 9 10 11 12 13

-14

-12

-10

-8

-6

-4

-2

0

250MHz 500MHz 750MHz

Q factor

Error()

Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator

The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation

54 simulations 86

M1

TSV

TSV

TSV

Data in OutCMOS

OutER

Energy-recovery

CMOS

P2LC

Adiabaticdriver

Figure 511 Test case for comparison of energy-recovery to CMOS

541 Comparison to CMOS

When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F

for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is

described by Equation 52 Appending the data switching activity(D)

ECMOS = D middot 12

CLV2DD (516)

Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS

circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison

Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range

54 simulations 87

Figure 512 Improved P2LC design

0 100 200 300 400 500 600 700 800 9000

2

4

6

8

10

12

14

Frequency (MHz)

Energy

dissipation(fJ)

Figure 513 Simulated energy dissipation of improved P2LC design

Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current

The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor

54 simulations 88

4 6 8 10 12 14 16 18 20 22-20

-10

0

10

20

30

40

50

60

800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor

Energyred uction()

Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)

4 6 8 10 12 14 16 18 20 220050

100150200250300350400450

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)

values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors

In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies

In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q

factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total

In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS

circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in

54 simulations 89

4 6 8 10 12 14 16 18 20 2200

100

200

300

400

500

600

700

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)

20 40 60 80 100 120 140 160 180 200 220-100

00

100

200

300

400

500

Q6 Q10 Q14 Q18

TSV capacitance (fF)

Energy

red u

ction(

)

Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)

Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value

Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz

Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm

node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS

54 simulations 90

4 6 8 10 12 14 16 18 20 22-40

-20

0

20

40

60

80

D=05 D=07 D=09 D=1

Q factor

Energyred uction()

Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)

4 6 8 10 12 14 16 18 20 220

10

20

30

40

50

60

130nm 65nm

Q factor

Energyred uction()

Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)

55 summary and conclusions 91

55 summary and conclusions

The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process

The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement

Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme

The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process

6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R

61 introduction

In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance

Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements

62 physical implementation

The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5

The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61

The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz

were calculated using the methodology developed in Chapter 5

and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery

circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)

92

62 physical implementation 93

AB

AB

CDEFD

C

ABCD

EFD

C

Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)

Adiabatic driver 80-bit circuit

Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )

RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)

CL 207 f F (Eq57) Lind 613nH (Eq54)

Table 61 Energy-recovery IO circuit parameters

62 physical implementation 94

IntegratedInductor

VINDOER

PulseGenerator

S1AS2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

Figure 62 Energy-recovery IO circuit schematic diagram

Metal2Metal1PolyActiveVDD

GND

RES_CLK

DATA

OUT

OUT_B

DATA

OUT_B OUT

RES_CLK

GND

VDD

Figure 63 Adiabatic driver (layoutschematic view)

the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs

621 Adiabatic driver

The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63

62 physical implementation 95

Metal2Metal1PolyActive

IN

OUT_B

VDD

IN_B

OUT

GND

OUT_B

IN

GND

VDD

OUT

IN_B

Figure 64 P2LC (layoutschematic view)

622 Pulse-to-level converter

The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals

623 Data generator

A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1

As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE

624 Pulse generator

The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the

62 physical implementation 96

Metal2Metal1PolyActive EN

CLK

VDD

Q

GND

D

CLKGND

ENVDD

QF-F

D

GND

VDDCLK

D Q

Figure 65 Data generator (layoutschematic view)

Delay1

WPULSE

PULSE

CLKTCLK

DATA

CLK_RES

Figure 66 Timing constraints for the energy-recovery demonstratorcircuit

62 physical implementation 97

PULSES1A

S2

GND

VDD

Metal2Metal1PolyActive

VDD

S1

S2

GND

PULSE

Figure 67 Pulse generator (layoutschematic view)

S2ΔPhase

WPULSE

PULSE

S1A

Figure 68 Pulse generator InputOutput signals

gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control

For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68

625 Integrated inductor

The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-

1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)

62 physical implementation 98

100 200 300 400 500 600 700 8005

7

9

11

13

15

17

19

21

10

15

20

25

30

35

40

45

Q (Simulated) Energy Reduction ()

Frequency (MHz)

Qfactor

Energyreduction()

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS

lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)

As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited

For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610

626 Top level design

The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)

For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously

62 physical implementation 99

500μ

m

500μm

Figure 610 Planar square spiral inductor designed in ASITIC

shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance

The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG

80 = 16880 = 21Ω However as

the current flows to lower branch levels in the tree the perceivedresistance increases to RTG

16 RTG4 and on the last one to RTG Any

additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512

Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20

4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit

To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of

62 physical implementation 100

Adiabatic drivers

RTG

16

RTG

80

RTG

4

RTG

Level 1Level 2Level 3Level 4

Figure 611 Resonant pulse distribution tree

Branch level Wire resistance

1st 5 middot 16880 = 105mΩ

2nd 5 middot 16816 = 525mΩ

3rd 5 middot 1684 = 21Ω

4th 5 middot 168Ω = 84Ω

Total load 168+8480 + 21

20 + 05255 + 0105 = 252Ω

Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree

63 post-layout simulation and comparison 101

+-

+-Metal2 Metal1

180μ

m

100μm

Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)

the energy-recovery IO circuit in layout view is illustrated inFigure 613

627 CMOS IO circuit

The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance

The top level layout view of the CMOS IO circuit can be seenin Figure 616

63 post-layout simulation and comparison

The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed

63 post-layout simulation and comparison 102

22mm

12mm

GND GNDVBUF1 VDR

GND

VIND

S1A

S2

EN

OER

Figure 613 Energy-recovery IO circuit in layout view

63 post-layout simulation and comparison 103

OCMOS

S1B DataGenerator

TSV

CMOSDriver

CMOSDriver

TSV Buffer

1

80

EN

Figure 614 CMOS IO circuit schematic diagram

GND

VCMOS

DATA

078u

039u

105u

053u

316u

158u

950u

475u

OUT

Figure 615 CMOS driver schematic

12mm

12mm

OCMOS S1B GND VBUF2 VCMOSGND

GND GNDEN VCMOS

Figure 616 CMOS IO circuit in layout view

63 post-layout simulation and comparison 104

Energy dissipation (80-bits cycle)

Inductor 11pJ (Eq514)

Switch M1 092pJ [ ]

P2LCs 098pJ (Simulated)

Adiabatic drivers 315pJ (Eq512)

Total 615pJ

Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz

After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency

Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme

As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the

63 post-layout simulation and comparison 105

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14RES_CLK

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14DATA

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14OER

Time (ns)

V(V

)

Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit

64 experimental measurements 106

AreaEnergy

dissipation(Theory)

Energydissipation

(Post-layout)

CMOS 144mm2922pJ

(115 f Jbit)

(Eq52)

1265pJ

(158 f Jbit)

Energy recovery 264mm2 615pJ

(77 f Jbit)

768pJ

(96 f Jbit)

Difference +83 minus333 minus393

Table 64 Post-layout simulated energy dissipation Area require-ments

theoretically calculated 333 up to 393 in the post-layoutsimulations

64 experimental measurements

The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively

641 Measurement setup

For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620

The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66

642 Results

CMOS IO circuit

In Figure 621 the input signal S1B and the output signal OCMOS

are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency

64 experimental measurements 107

Figure 618 Energy-recovery IO circuit micrograph

Figure 619 CMOS IO circuit micrograph

AABC

DEFF

D

EEBA

EBBA

Figure 620 Measurement setup

64 experimental measurements 108

Test pad Direction Instrument connection

VIND Power Power supply - gt06V

GND Power Power supply - 0V

S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)

EN InputDigital Output (enable outputdata switching)

OER Output Oscilloscope

VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V

VBUF1 Power Power supply (Buffers) - 12V

Table 65 Energy-recovery circuit instrument connections

Test pad Direction Instrument connection

VCMOS PowerPower supply (CMOS drivers) -12V

GND Power Power supply - 0V

VBUF2 Power Power supply (Buffers) - 12V

S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

OCMOS Output Oscilloscope

EN InputDigital Output (enable outputdata switching)

Table 66 CMOS circuit instrument connections

64 experimental measurements 109

S1B

OCMOS

Figure 621 Oscilloscope output for signals S1B and OCMOS

100 200 300 400 500 600 700 800012345678910

Die1 Die2 Simulation

Frequency (MHz)

Powerdissipation(mW)

Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit

that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected

The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations

Energy-recovery IO circuit

Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER

64 experimental measurements 110

100 200 300 400 500 600 700 800024681012141618

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 623 Measuredsimulated energy dissipation for the CMOScircuit

S1A

OER

Figure 624 Oscilloscope output for signals S1A and OER

shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected

The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)

Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate

65 summary and conclusions 111

200 250 300 350 400 450 500 550 600 65002468101214161820

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit

IntegratedInductor

VINDOER

PulseGenerator

S1A

S2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

200pF

Figure 626 Energy-recovery IO circuit with fault hypothesis

properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER

switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation

65 summary and conclusions

In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them

The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-

65 summary and conclusions 112

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050

055

060

065

070

075

080RES_CLK

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500

02

04

06

08

10

12

14OER

Time (ns)

V(V

)

Figure 627 Simulation of circuit with fault hypothesis

65 summary and conclusions 113

200 250 300 350 400 450 500 550 600 6500

5

10

15

20

25

Die1 Die2 Simulation CL=200pF

Frequency (MHz)

Ene

rgy

diss

ipat

ion

(pJ)

Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis

ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS

circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated

CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

7C O N C L U S I O N S

Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density

As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV

devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs

Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-

FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process

Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process

The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work

114

71 main contributions 115

71 main contributions

From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data

The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations

In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design

72 future work 116

approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances

The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

72 future work

The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV

capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]

In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes

1 The transistors should be in a regular grid in the array forsimple processing of the measurement data

72 future work 117

2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy

3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV

4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues

Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility

In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory

AA P P E N D I X A P H O T O S

Figure A1 3D65 300mm wafer (on chuck)

Figure A2 Probe card

118

appendix a photos 119

Figure A3 3D130 200mm wafer (on chuck)

ProbeHead

Microscope

Figure A4 Probe head

BA P P E N D I X B S TAT I S T I C S

mean (micro)

In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as

micro =1n

n

sumi=1

xi

median (m)

The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean

standard deviation (σ)

Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation

A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability

standard deviation of the mean (σmicro )

The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated

Assuming there is a data set X with n values xi of mean micro

120

appendix b statistics 121

variance(X) = σ2X

variance(micro) = variance( 1n sum

ni=1 xi) =

1n2 variance(sumn

i=1 xi) =1n2 sum

ni=1 variance(xi) =

nn2 variance(X) =1n variance(X) =

σ2Xn rarr

σmicro = σXradicn

coefficient of variation

The coefficient of variation is the ratio of the standard deviationσ to the mean micro

CV =σ

micro

It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage

outlier

Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set

kruskalndash wallis [34]

KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal

B I B L I O G R A P H Y

[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac

currentsourcebroadpurposemn=2602

[2] Max-3d URL httpwwwmicromagiccom

[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet

[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet

[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001

[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI

Design Conf pages 171ndash174 2005 doi 101109ICVD200564

[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf

3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581

[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009

[9] C H Bennett Logical reversibility of computation IBM

Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525

[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038

scientificamerican0785-48

[11] E Beyne 3d system integration technologies In Proc Int

VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113

122

bibliography 123

[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs

22nm-Details_Presentationpdf

[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327

[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534

[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices

Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124

[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942

[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc

IEEE Int Interconnect Technology Conf and 2011 Materials for

Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326

[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf

(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823

[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS

rsquo04 volume 2 2004 doi 101109ISCAS20041329255

[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE

Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260

[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-

bibliography 124

asitics for 3-dimensional integrated circuits In Workshop

Notes Design Automation and Test in Europe (DATE) 2009

[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT

2008 pages 98ndash103 2008 doi 101109IDT20084802475

[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec

beScientificReportSR2009HTML1213307html

[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905

[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE

Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329

[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int

Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311

[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508

[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712

[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii

B9780750677295500446

[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455

bibliography 125

[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591

[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test

Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889

[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on

Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10

1145224081224115

[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical

Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779

[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom

pressroompdfkkuhnKuhn_IWCE_invited_textpdf

[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5

(3)183ndash191 1961 doi 101147rd530183

[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827

[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190

[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf

(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839

[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-

nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079

bibliography 126

[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System

level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm

org101145966747966750

[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453

[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf

PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725

[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE

Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793

[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629

[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp

Exhibition (DATE) pages 1689ndash1694 2010

[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573

[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190

[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi

bibliography 127

A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices

Meeting (IEDM) 2010 doi 101109IEDM20105703278

[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727

[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10

1109JPROC1998658762

[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test

Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103

[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc

56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676

[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443

[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88

(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom

sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on

[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303

bibliography 128

[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium

on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145

10576611057669

[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872

[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004

[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-

sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888

[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156

[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc

IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522

[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501

[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667

[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477

[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-

tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609

bibliography 129

[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology

Conf 2006 doi 101109ECTC20061645678

[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power

Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220

[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc

Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786

[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-

Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338

[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088

[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044

[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int

Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763

[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-

tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http

dxdoiorg101007BF01788562 101007BF01788562

bibliography 130

[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-

electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww

sciencedirectcomsciencearticleB6V44-4829XDJ-2N

2f5b2c221dcf240c38cd13b7b3432f913

[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182

[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541

[78] NHE Weste and DF Harris CMOS VLSI design a cir-

cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775

[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite

printsteve_wozniak_interview

[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-

tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716

[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect

Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710

[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746

[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764

[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915

bibliography 131

[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610

  • Abstract
  • Publications
  • Acknowledgments
  • Contents
  • List of Figures
  • List of Tables
  • Acronyms
  • 1 Introduction
    • 11 Research goals and contribution
    • 12 Thesis outline
      • 2 Background
        • 21 3D integrated circuits
          • 211 3D integration approaches
          • 212 TSV interconnects
            • 22 Energy-recovery logic
              • 221 Energy-recovery architectures
              • 222 Power-clock generators
                • 23 Full-custom design flow
                • 24 Summary
                  • 3 Characterization of TSV capacitance
                    • 31 Introduction
                    • 32 TSV parasitic capacitance
                    • 33 Integrated capacitance measurement techniques
                    • 34 Electrical characterization
                      • 341 Physical implementation
                      • 342 Measurement setup
                      • 343 Experimental results
                        • 35 Summary and conclusions
                          • 4 TSV proximity impact on MOSFET performance
                            • 41 Introduction
                            • 42 TSV-induced thermomechanical stress
                            • 43 Test structure implementation
                              • 431 MOSFET arrays
                              • 432 Digital selection logic
                                • 44 Experimental measurements
                                  • 441 Data processing methodology
                                  • 442 Measurement setup
                                  • 443 PMOS (long channel)
                                  • 444 PMOS (short channel)
                                    • 45 Summary and conclusions
                                      • 5 Evaluation of energy-recovery for TSV interconnects
                                        • 51 Introduction
                                        • 52 Energy-recovery TSV interconnects
                                        • 53 Analysis
                                          • 531 Adiabatic driver
                                          • 532 Inductors parasitic resistance
                                          • 533 Switch M1
                                          • 534 Timing Constraints
                                            • 54 Simulations
                                              • 541 Comparison to CMOS
                                                • 55 Summary and conclusions
                                                  • 6 Energy-recovery 3D-IC demonstrator
                                                    • 61 Introduction
                                                    • 62 Physical implementation
                                                      • 621 Adiabatic driver
                                                      • 622 Pulse-to-level converter
                                                      • 623 Data generator
                                                      • 624 Pulse generator
                                                      • 625 Integrated inductor
                                                      • 626 Top level design
                                                      • 627 CMOS IO circuit
                                                        • 63 Post-layout simulation and comparison
                                                        • 64 Experimental measurements
                                                          • 641 Measurement setup
                                                          • 642 Results
                                                            • 65 Summary and conclusions
                                                              • 7 Conclusions
                                                                • 71 Main contributions
                                                                • 72 Future work
                                                                  • A Appendix A Photos
                                                                  • B Appendix B Statistics
                                                                  • Bibliography
Page 3: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate

O P T I M I Z I N G T H E I N T E G R AT I O N A N DE N E R G Y E F F I C I E N C Y O F T H R O U G H

S I L I C O N V I A - B A S E D 3 D I N T E R C O N N E C T S

panagiotis asimakopoulos

A Thesis Submitted for the Degree of Doctor of Philosophyat Newcastle University

School of Electrical Electronic and Computer Engineering

Newcastle upon Tyne

November 2011

Panagiotis Asimakopoulos Optimizing the integration and energy

efficiency of through silicon via-based 3D interconnects PhD Thesiscopy November 2011

A B S T R A C T

The aggressive scaling of CMOS process technology has been driv-ing the rapid growth of the semiconductor industry for more thanthree decades In recent years the performance gains enabledby CMOS scaling have been increasingly challenged by highly-parasitic on-chip interconnects as wire parasitics do not scale atthe same pace Emerging 3D integration technologies based onvertical through-silicon vias (TSVs) promise a solution to the inter-connect performance bottleneck along with reduced fabricationcost and heterogeneous integration

As TSVs are a relatively recent interconnect technology innov-ative test structures are required to evaluate and optimise theprocess as well as extract parameters for the generation of designrules and models From the circuit designerrsquos perspective criticalTSV characteristics are its parasitic capacitance and thermomech-anical stress distribution This work proposes new test structuresfor extracting these characteristics The structures were fabric-ated on a 65nm 3D process and used for the evaluation of thattechnology

Furthermore as TSVs are implemented in large densely inter-connected 3D-system-on-chips (SoCs) the TSV parasitic capacitancemay become an important source of energy dissipation Typicallow-power techniques based on voltage scaling can be usedthough this represents a technical challenge in modern techno-logy nodes In this work a novel TSV interconnection scheme isproposed based on reversible computing which shows frequency-dependent energy dissipation The scheme is analysed using the-oretical modelling while a demonstrator IC was designed basedon the developed theory and fabricated on a 130nm 3D process

iii

P U B L I C AT I O N S

Some ideas and figures have appeared previously in the followingpublications

1 D Perry J Cho S Domae P Asimakopoulos A YakovlevP Marchal G Van der Plas and N Minas An efficientarray structure to characterize the impact of through siliconvias on fet devices In Proc IEEE Int Microelectronic TestStructures (ICMTS) Conf pages 118ndash122 2011 (best paper

award)

2 A Mercha G Van der Plas V Moroz I De Wolf P Asi-

makopoulos N Minas S Domae D Perry M Choi ARedolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron DevicesMeeting (IEDM) 2010

3 A Mercha A Redolfi M Stucchi N Minas J Van OlmenS Thangaraju D Velenis S Domae Y Yang G Katti RLa- bie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D Perry G Vander Plas J H Cho P Marchal Y Travaly E Beyne S Biese-mans and B Swinnen Impact of thinning and throughsilicon via proximity on high-k metal gate first cmos per-formance In Proc Symp VLSI Technology (VLSIT) pages109ndash110 2010

4 P Asimakopoulos G Van der Plas A Yakovlev and PMarchal Evaluation of energy-recovering interconnects forlow-power 3d stacked ics In Proc IEEE Int Conf 3D Sys-tem Integration 3DIC 2009 pages 1ndash5 2009

5 P Asimakopoulos and A Yakovlev An AdiabaticPower-Supply Controller For Asynchronous Logic Circuits20th UK Asynchronous Forum pages 1ndash4 2008

iv

In the end I hope therersquos a little note somewhere

that says I designed a good computer

mdash Steve Wozniak [79]

A C K N O W L E D G E M E N T S

First and foremost I would like to express my gratitude to mysupervisor Prof Alex Yakovlev who provided the perfect balancebetween freedom and guidance allowing me to obtain sphericalknowledge and scope in microelectronics but at the same time topromptly complete my studies Irsquom also grateful to the Engineer-ing and Physical Science Research Council (EPSRC) for fundingmy PhD studies through their Doctoral Training Accounts

As part of my research project I had the opportunity to spendsome time at Imec Belgium The experiences and lessons learnedduring my stay at Imec will undoubtedly prove valuable in myforthcoming professional career I would like to thank Dan Perryfrom Qualcomm for sharing his experience and ideas for the de-sign of the characterization structures implemented on the 65nm

Imec process Special thanks to Shinichi Domae from Panasonicand Dr Dimitrios Velenis for their assistance with the 65nm wafermeasurements as well as Marco Facchini for the time he spendtrying to measure the energy-recovery circuits Last but not leastIrsquod like to thank Geert Van der Plas and Paul Marchal for theiradvices and guidance throughout my stay at Imec

Finally many thanks to my colleagues at Newcastle Universityfor assisting me in diverse ways during my studies I would liketo especially mention Dr Santosh Shedabale and James Dochertyfor reviewing and commenting on parts of this text as well asDr Robin Emery for the numerous technical discussions and forhelping setup the design tools

v

C O N T E N T S

1 Introduction 1

11 Research goals and contribution 2

12 Thesis outline 3

2 Background 5

21 3D integrated circuits 5

211 3D integration approaches 6

212 TSV interconnects 9

22 Energy-recovery logic 13

221 Energy-recovery architectures 17

222 Power-clock generators 18

23 Full-custom design flow 20

24 Summary 21

3 Characterization of TSV capacitance 23

31 Introduction 23

32 TSV parasitic capacitance 23

33 Integrated capacitance measurement techniques 24

34 Electrical characterization 27

341 Physical implementation 28

342 Measurement setup 36

343 Experimental results 37

35 Summary and conclusions 44

4 TSV proximity impact on MOSFET performance 46

41 Introduction 46

42 TSV-induced thermomechanical stress 46

43 Test structure implementation 48

431 MOSFET arrays 51

432 Digital selection logic 55

44 Experimental measurements 61

441 Data processing methodology 61

442 Measurement setup 62

443 PMOS (long channel) 64

444 PMOS (short channel) 71

45 Summary and conclusions 73

5 Evaluation of energy-recovery for TSV interconnects 75

51 Introduction 75

52 Energy-recovery TSV interconnects 75

53 Analysis 79

531 Adiabatic driver 80

532 Inductorrsquos parasitic resistance 82

533 Switch M1 82

534 Timing Constraints 83

54 Simulations 83

541 Comparison to CMOS 86

vi

contents vii

55 Summary and conclusions 91

6 Energy-recovery 3D-IC demonstrator 92

61 Introduction 92

62 Physical implementation 92

621 Adiabatic driver 94

622 Pulse-to-level converter 95

623 Data generator 95

624 Pulse generator 95

625 Integrated inductor 97

626 Top level design 98

627 CMOS IO circuit 101

63 Post-layout simulation and comparison 101

64 Experimental measurements 106

641 Measurement setup 106

642 Results 106

65 Summary and conclusions 111

7 Conclusions 114

71 Main contributions 115

72 Future work 116

a Appendix A Photos 118

b Appendix B Statistics 120

bibliography 122

L I S T O F F I G U R E S

Figure 11 Scaling trend of logic and interconnect delay[30] 2

Figure 21 Comparison of 2D3D Integration 5

Figure 22 System-in-a-package 6

Figure 23 Package-on-package 6

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7

Figure 25 3D wafer-level-packaging 7

Figure 26 3D-stacked IC 8

Figure 27 TSV process steps 9

Figure 28 Before thinningafter thinning 10

Figure 29 TSV stacking 10

Figure 210 Conventional AND gate 14

Figure 211 CMOS switching 16

Figure 212 Adiabatic switching 16

Figure 213 Basic 2N-2P adiabatic circuit 17

Figure 214 2N-2P timing 17

Figure 215 Adiabatic driver 18

Figure 216 Stepwise charging generator 19

Figure 217 1N power clock generator 20

Figure 218 Design flow 22

Figure 219 Inverter in Virtuoso schematic 22

Figure 220 Inverter in Virtuoso layout 22

Figure 31 Isolated 2D wire capacitance 24

Figure 32 TSV parasitic capacitance 25

Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25

Figure 34 CBCM pseudo-inverters 27

Figure 35 Current as a function of frequency in CBCM 28

Figure 36 CBCM pseudo-inverter pair in layout 29

Figure 37 PRC1 test pad module (layout) 30

Figure 38 PRC1 test pad module (micrograph) 30

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34

Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35

viii

List of Figures ix

Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35

Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36

Figure 316 Measurement setup 36

Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39

Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40

Figure 319 TSV capacitance measurement results onwafer D11 40

Figure 320 TSV capacitance measurement results onwafer D18 41

Figure 321 Simulated oxide liner thickness variation 42

Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42

Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44

Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44

Figure 41 TSV thermomechanical stress simulation [40] 47

Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48

Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49

Figure 44 PTP1 test pad module (layout) 50

Figure 45 PTP1 test pad module (micrograph) 50

Figure 46 MOSFET dimensions and rotations 51

Figure 47 1 TSV MOSFET array configuration 52

Figure 48 4 diagonal TSVs MOSFET array configuration52

Figure 49 4 lateral TSVs MOSFET array configuration 53

Figure 410 9 TSVs MOSFET array configuration 53

Figure 411 MOSFET array detail 55

Figure 412 Digital selection logic 56

Figure 413 MOSFET array with digital selection logic 57

Figure 414 4-bit Gate-enableArray-enable decoder 58

Figure 415 Transmission gate for MOSFET Gate terminals59

Figure 416 Transmission gates for MOSFET Drain-Source terminals 59

Figure 417 2-bit Drain-select decoder 60

Figure 418 Processing of measurement results 62

Figure 419 Interpolation of measurement results 63

Figure 420 Measurement setup 63

Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65

List of Figures x

Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66

Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66

Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67

Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67

Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68

Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69

Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69

Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70

Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70

Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71

Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71

Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72

Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72

Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76

Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77

Figure 53 Adiabatic driver 78

Figure 54 A simple pulse-to-level converter (P2LC) 78

Figure 55 Resonant pulse generator with dual-rail dataencoding 79

Figure 56 Energy-recovery scheme for TSV interconnects79

Figure 57 Energy-recovery timing constraints 83

Figure 58 Test case for the evaluation of the theoret-ical model 84

Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85

Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85

Figure 511 Test case for comparison of energy-recoveryto CMOS 86

Figure 512 Improved P2LC design 87

List of Figures xi

Figure 513 Simulated energy dissipation of improvedP2LC design 87

Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88

Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88

Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89

Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =

80 f F f = 500MHz) 89

Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90

Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90

Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93

Figure 62 Energy-recovery IO circuit schematic dia-gram 94

Figure 63 Adiabatic driver (layoutschematic view) 94

Figure 64 P2LC (layoutschematic view) 95

Figure 65 Data generator (layoutschematic view) 96

Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96

Figure 67 Pulse generator (layoutschematic view) 97

Figure 68 Pulse generator InputOutput signals 97

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98

Figure 610 Planar square spiral inductor designed inASITIC 99

Figure 611 Resonant pulse distribution tree 100

Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101

Figure 613 Energy-recovery IO circuit in layout view 102

Figure 614 CMOS IO circuit schematic diagram 103

Figure 615 CMOS driver schematic 103

Figure 616 CMOS IO circuit in layout view 103

Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105

Figure 618 Energy-recovery IO circuit micrograph 107

List of Figures xii

Figure 619 CMOS IO circuit micrograph 107

Figure 620 Measurement setup 107

Figure 621 Oscilloscope output for signals S1B and OC-MOS 109

Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109

Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110

Figure 624 Oscilloscope output for signals S1A and OER110

Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111

Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111

Figure 627 Simulation of circuit with fault hypothesis 112

Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113

Figure A1 3D65 300mm wafer (on chuck) 118

Figure A2 Probe card 118

Figure A3 3D130 200mm wafer (on chuck) 119

Figure A4 Probe head 119

L I S T O F TA B L E S

Table 21 3D interconnect technologies comparison 8

Table 22 Technology parameters for Imec 3D130 process11

Table 23 Technology parameters for Imec 3D65 process11

Table 24 Comparison energy-recoveryconventionalCMOS 16

Table 31 Test structures on the PRC1 module 31

Table 32 PRC1 test pad instrument connections 37

Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38

Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39

Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43

Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45

Table 41 Coefficients of thermal expansion 47

Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54

Table 43 MOSFET array configurations selected forexperimental evaluation 61

Table 44 PTP1PTP2 test pad module instrumentconnections 64

Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73

Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84

Table 61 Energy-recovery IO circuit parameters 93

Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100

Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104

Table 64 Post-layout simulated energy dissipation Area requirements 106

Table 65 Energy-recovery circuit instrument connec-tions 108

Table 66 CMOS circuit instrument connections 108

xiii

A C R O N Y M S

2D two-dimentional

2N-2P 2 nMOS2 pMOS adiabatic logic

3D-IC three-dimentional integrated circuit

3D-SIC 3D stacked-IC

3D-SIP 3D system-in-package

3D three-dimentional

3D-WLP 3D wafer-level-packaging

BCB benzocyclobutene

BEOL back-end-of-line

CAL clocked adiabatic logic

CBCM charge-based capacitance measurement

CMOS complementary metal-oxide-semiconductor

CMP chemical-mechanical planarization

CTE coefficient of thermal expansion

Cu copper

CVD chemical vapor deposition

D2D die-to-die

D2W die-to-wafer

DCVSL differential cascode voltage switch logic

DRC design rule check

DUT device under test

ECRL efficient charge recovery logic

EDA electronic design automation

FEM finite element method

FEOL front-end-of-line

GPIB general purpose interface bus

IC integrated circuit

xiv

acronyms xv

IO inputoutput

KOZ keep-out-zone

LC inductance-capacitance

LVS layout versus schematic

MEMS microelectromechanical systems

MIM metal-insulator-metal

MOSFET metal-oxide-semiconductor field-effect transistor

MOS metal-oxide-semiconductor

MPU microprocessor unit

MuGFET multiple gate field-effect transistor

NMOS n-channel MOSFET

P2LC pulse-to-level converter

PAL pass-transistor adiabatic logic

PCB printed circuit board

PCG power-clock generator

PMD pre-metal dielectric

PMOS p-channel MOSFET

PVT process voltage temperature

QSERL quasi-static energy recovery logic

RC resistance-capacitance

RDL re-distribution layer

RLC resistance-inductance-capacitance

RMS root mean square

SIP system-in-a-package

Si silicon

SMU source measurement unit

SoC system-on-chip

TaN tantalum nitride

TSV through-silicon via

W tungsten

acronyms xvi

W2W wafer-to-wafer

WLP wafer-level-packaging

1I N T R O D U C T I O N

Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965

chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]

As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects

In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)

The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and

1 International Technology Roadmap for Semiconductors

1

11 research goals and contribution 2

Figure 11 Scaling trend of logic and interconnect delay [30]

innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]

Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D

interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density

11 research goals and contribution

As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-

ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]

The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future

12 thesis outline 3

3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection

The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution

The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs

are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process

The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements

12 thesis outline

This thesis is organised into seven Chapters and two Appendixesas follows

2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)

12 thesis outline 4

Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described

Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods

Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process

Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design

Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements

Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work

Appendix A includes photos used as reference in various partsof this work

Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data

2B A C K G R O U N D

The work presented in this thesis contributes in the fields of 3D

integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters

21 3d integrated circuits

CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package

There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy

Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of

ABCDEFBA ABCDEFBA

Figure 21 Comparison of 2D3D Integration

5

21 3d integrated circuits 6

ABCDEFBC

E

Figure 22 System-in-a-package

ABCDEFBC

E

Figure 23 Package-on-package

a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities

The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor

211 3D integration approaches

3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure

bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages

bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)

bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion

The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)

3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps

21 3d integrated circuits 7

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]

A

BCDEAFE

FFDC

CFE

Figure 25 3D wafer-level-packaging

take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]

3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]

The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-

SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost

A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21

21 3d integrated circuits 8

ABCD

EFFDB

F

Figure 26 3D-stacked IC

3D interconnect

technologyAdvantages Disadvantages

3D system-in-package (3D-SIP)

Simple existingtechnology bestheterogeneous

integration

Performance(speed power) low

densitymanufacturing costin high-quantities

3D wafer-level-packaging (3D-

WLP)

Existingtechnology good

compromisebetween

performance-manufacturing

cost

Averageperformance and

density

3D stacked-IC (3D-SIC)

Performance(speed power)

high densitymanufacturing costin high-quantities

Manufacturing costin low-quantities

not ready formanufacturing

Table 21 3D interconnect technologies comparison

21 3d integrated circuits 9

ABBCDE

FBACACDA

ACCD F

Figure 27 TSV process steps

212 TSV interconnects

Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection

In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie

Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively

As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]

21 3d integrated circuits 10

AAAB

CAAAB

ABC

DEF

Figure 28 Before thinningafter thinning

ABC ABC

DE DE

F

DEE

B

CBA

Figure 29 TSV stacking

21 3d integrated circuits 11

Minimum devicechannel length (nm)

130

Supply voltage (V) 12

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 22

TSV resistance (mΩ) ~20

TSV capacitance( f F)

~40

Wafer diameter(mm)

200

Table 22 Technology paramet-ers for Imec 3D130

process

Minimum devicechannel length (nm)

70

Supply voltage (V) 10

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 40

TSV resistance (mΩ) ~40

TSV capacitance( f F)

~80

Wafer diameter(mm)

300

Table 23 Technology paramet-ers for Imec 3D65 pro-cess

Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing

Design tools

Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs

Thermomechanical stress

TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur

21 3d integrated circuits 12

The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance

Thermal analysis

The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink

Electrical characteristics

The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]

Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV

structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process

22 energy-recovery logic 13

Testing

Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems

22 energy-recovery logic

TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology

1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections

2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them

Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that

22 energy-recovery logic 14

Figure 210 Conventional AND gate

were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6

Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded

Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output

Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to

22 energy-recovery logic 15

perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]

In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]

The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =

CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]

E = QVDD = CV2DD (21)

From the total transferred energy only half ( 12 CV2

DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C

The adiabatic charging of an equivalent node capacitance C

is modelled in Figure 212 In contrast to the conventional CMOS

circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to

EAdiabatic prop

(

RC

T

)

CV2DD (22)

The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters

22 energy-recovery logic 16

A

B

CC

D

E

D

Figure 211 CMOS switching

AA

B

C

B

Figure 212 Adiabatic switching

Conventional CMOS Energy-recovery CMOS

2 states True False 3 states True False Off

Energy of information bitdissipated as heat on the

pull-down network

Energy of information bitrecovered by the

power-supply

High speed Lowmedium speed

High energy dissipation Very low energy dissipation

Area efficient Some area overhead

Table 24 Comparison energy-recoveryconventional CMOS

Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]

A comparison between energy-recovery and conventional CMOSis summarised in Table 24

22 energy-recovery logic 17

ABCABC

Figure 213 Basic 2N-2P adiabaticcircuit

ABC

ABC

DEF

EE

DEF

Figure 214 2N-2P timing

221 Energy-recovery architectures

Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency

A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P

circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern

The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel

22 energy-recovery logic 18

Figure 215 Adiabatic driver

MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network

The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]

The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle

Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]

222 Power-clock generators

Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and

22 energy-recovery logic 19

A

A

B

C

C

Figure 216 Stepwise charging generator

recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators

Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow

Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at

f =1

2πradic

LCLOAD

(23)

As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is

23 full-custom design flow 20

A

BCD

E

DFF

CD

Figure 217 1N power clock generator

partially recovered and stored in the inductor for later use RLC

oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip

23 full-custom design flow

All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal

In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available

The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case

1 Cadence Design Systems Inc (httpwwwcadencecom)

24 summary 21

the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step

Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics

24 summary

In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process

2 Mentor Graphics Inc (httpwwwmentorcom)

24 summary 22

ABCDEA

FCCAAE

ABCDEAEC

DE

E

F

CEA

EEAEA

EDE

CAC

E

EC

CAC

Figure 218 Design flow

Figure 219 Inverter in Virtuososchematic

Figure 220 Inverter in Virtuosolayout

3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E

31 introduction

In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology

In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]

From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system

Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated

32 tsv parasitic capacitance

A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric

1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments

23

33 integrated capacitance measurement techniques 24

ABC

Figure 31 Isolated 2D wire capacitance

material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is

C =εox

hwl (31)

In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material

In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)

The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)

33 integrated capacitance measurement techniques

Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV

33 integrated capacitance measurement techniques 25

ABCDE

F

FD

F

F

FD

Figure 32 TSV parasitic capacitance

ABCDE FACDE

ECDE

BCFACDE

FCCDED DBCDE

D

DAB

Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor

33 integrated capacitance measurement techniques 26

characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories

1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]

2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]

The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer

On-chip techniques can achieve better accuracy when the DUT

capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]

In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows

CDUT =Im minus Ire f

VDD middot f(32)

The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of

34 electrical characterization 27

AB

AC

DE

FE

DE

FE

AB

AC

B

Figure 34 CBCM pseudo-inverters

Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot

The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET

capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz

should be adequate to avoid the MOS behaviour of the TSV

34 electrical characterization

A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented

34 electrical characterization 28

AB

CADE

F

AB

E

CAD

B

Figure 35 Current as a function of frequency in CBCM

on the same chip that went beyond the core objectives of thiswork

The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed

341 Physical implementation

In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result

Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics

34 electrical characterization 29

Poly dummy fills

Metal2Metal1PolysiliconActive

1μm 5μm

Figure 36 CBCM pseudo-inverter pair in layout

Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38

The CBCM pseudo-inverters were implemented on either TOP

or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below

Structure rsquoArsquo Single TSV capacitance (TOP die)

The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to

34 electrical characterization 30

DTOP DBTM

B10 A10

BREF AREF

B1 A1

R1 R2

SET VDD

RESET GND

C15 C10

C50 C20

SET VDD

ERC SET

EREF EC Metal2 (TOP)Metal1 (TOP)

Metal1 (BTM)Metal2 (BTM)

Figure 37 PRC1 test pad module(layout)

Figure 38 PRC1 test pad module(micrograph)

34 electrical characterization 31

Structure A B

TSVs - 1 2x5 - 1 2x5

Inverterlocation

TOP TOP TOP BTM BTM BTM

TSVconnection

TOP TOP TOP BTM BTM BTM

Multi-TSVconnection

- - TOP - - BTM

Multi-TSVpitch (microm)

- - 15 - - 15

Pad name AREF A1 A10 BREF B1 B10

Structure C D

TSVs 2x2 2x2 2x2 2x2 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP TOP

TSVconnection

TOP TOP TOP TOP TOP TOP

Multi-TSVconnection

TOP TOP TOP TOP TOP BTM

Multi-TSVpitch (microm)

10 15 20 50 20 20

Pad name C10 C15 C20 C50 DTOP DBTM

Structure R E

TSVs 1 1 - 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP

TSVconnection

TOP TOP - TOP TOP

Multi-TSVconnection

- - - TOP BTM

Multi-TSVpitch (microm)

- - - 20 20

Pad name R1 R2 EREF EC ERC

Table 31 Test structures on the PRC1 module

34 electrical characterization 32

Pseudo-inverters

2x5 TSVs

TSV

Reference

Metal2 (TOP)Metal1 (TOP)

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)

the single TSV DUT the structure also contains 10 parallel TSVs

arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude

Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)

The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side

Structure rsquoCrsquo Variable pitch TSV capacitance

The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the

34 electrical characterization 33

2x5 TSVs

Reference

TSV

Pseudo-inverters

Metal2 (BTM)Metal1 (BTM)

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)

literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process

The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative

Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-

TOM die

The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking

34 electrical characterization 34

2x2 TSVs - 10μm

2x2 TSVs - 15μm

2x2 TSVs - 20μm

2x2 TSVs - 50μm

Metal2 (TOP)Metal1 (TOP)

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die

34 electrical characterization 35

Metal2 (TOP)Metal1 (TOP)

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Driver

Reference

Driver

Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)

Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy

Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability

Structure rsquoErsquo TSV capacitance effect on signal propagation delay

Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult

34 electrical characterization 36

ABC

D

ECBFC

ABC

Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)

ABC

DEF

CD

A

CA

A

Figure 316 Measurement setup

342 Measurement setup

The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316

The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-

2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup

34 electrical characterization 37

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

SET InputPulse generator

(Non-overlapping with RESET)

RESET InputPulse generator

(Non-overlapping with SET)

A[REF 1 10] InputSMU (Inverter source - 1V

supply)

B[REF 1 10] InputSMU (Inverter source - 1V

supply)

C[10 15 2050]

InputSMU (Inverter source - 1V

supply)

D[TOP BTM] InputSMU (Inverter source - 1V

supply)

R[1 2] InputSMU (Inverter source - 1V

supply)

E[REF C RC] Output Oscilloscope (high-impedance)

Table 32 PRC1 test pad instrument connections

sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32

343 Experimental results

There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods

When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy

34 electrical characterization 38

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 454

Mean capacitance 709 f F

Standard deviation (σ) 19 f F

Coefficient of variation (3σmicro) 80

Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo

The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability

Die to die (D2D) variability

The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)

On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid

On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691

34 electrical characterization 39

-60 -40 -20 00 20 40 60 80

σ-σ

706

Deviation from mean (fF)

0

005

01

015

02

025

Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 414

Mean capacitance 895 f F

Standard deviation (σ) 22 f F

Coefficient of variation (3σmicro) 74

Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo

of the experimental data within plusmnσ of the mean capacitancevalue

Correlation between TSV capacitance and oxide liner thickness vari-

ation

The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers

Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication

34 electrical characterization 40

σ-σ

691

Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70

0

002

004

006

008

01

012

014

016

018

02

Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function

075

080

085

090

095

100

105

110

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

075

080

085

090

095

100

105

Wafer area

Figure 319 TSV capacitance measurement results on wafer D11

34 electrical characterization 41

086088090092094096098100102104106108

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

086

088

090

092

094

096

098

100

102

104

106

108

Wafer area

Figure 320 TSV capacitance measurement results on wafer D18

tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo

Wafer to wafer (W2W) variability

Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters

34 electrical characterization 42

109

100

095Nor

mal

ized

oxi

de th

ickn

ess

-5-10 -15 -20 -25 -30

510

1520

2530

0Die Y

Die X

Figure 321 Simulated oxide liner thickness variation

60 65 70 75 80 85 90 95 1000

005

01

015

02

025D11 D18

plusmnσ

plusmnσ

Capacitance (fF)

Figure 322 TSV capacitance W2W variability - Probability density func-tions

34 electrical characterization 43

Structure rsquoCrsquo (Fig 311)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 424

Experiment C10 C50

Number of TSVs 4 4

Mean capacitance 4073 f F 4125 f F

Standard deviation (σ) 84 f F 74 f F

Population within σ 695 653

Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo

Variable pitch TSV capacitance

The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm

were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires

As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50

are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability

To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]

Measurement accuracy of CBCM

The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we

3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)

35 summary and conclusions 44

376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320

001

002

003

004

005

006C10 C50

Capacitance (fF)

plusmnσC10

plusmnσC50

695 653

Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function

-80 -60 -40 -20 00 20 40 60 80 1000

002

004

006

008

01

012

014

016

018σ-σ

688

Deviation from mean (fF)

Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function

observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs

connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance

35 summary and conclusions

In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement

35 summary and conclusions 45

Method LCR CBCM

Total dies 472 472

Processed dies (excluding outliers) 393 414

Mean capacitance 820 f F 895 f F

Standard deviation (σ) 23 f F 22 f F

Coefficient of variation (3σmicro) 86 74

Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo

results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]

The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration

The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV

parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM

In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process

4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E

41 introduction

The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV

filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance

Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models

In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]

42 tsv-induced thermomechanical stress

Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or

46

42 tsv-induced thermomechanical stress 47

Material CTE at 20 ˚C (10minus6K)

Si 3

Cu 17

W 45

Table 41 Coefficients of thermal expansion

Figure 41 TSV thermomechanical stress simulation [40]

polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate

Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction

It is well known that stress can affect carrier mobility in MOS-

FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance

Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type

43 test structure implementation 48

Figure 42 TSV thermomechanical stress interaction between two TSVs[40]

proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative

The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections

43 test structure implementation

TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible

43 test structure implementation 49

Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]

The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET

devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types

Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP

die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45

43 test structure implementation 50

A0

B0

C0

D0

E0

F0

G0

H0

I0

J0

A90

B90

E90

F90

G90

J90

DS3 DF3

DF2 DS0

DS2 DF0

DF1 DS1

SS SF

DEC6 VDD

DEC7 GND

DEC8 VG

DEC4 DEC9

DEC5 GND

DEC2 DEC3

DEC0 DEC1

Arraydecoder

Draindecoder

Figure 44 PTP1 test pad mod-ule (layout)

Figure 45 PTP1 test pad mod-ule (micrograph)

43 test structure implementation 51

Metal2Metal1PolyActive

TSV

MOSFETs

07μm007μm 0deg

07μm007μm 90deg

05μm05μm 90deg

05μm05μm 0deg

Figure 46 MOSFET dimensions and rotations

431 MOSFET arrays

Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)

The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs

lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44

The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)

Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution

43 test structure implementation 52

TSV

MOSFETs Substrate taps

Figure 47 1 TSV MOSFET array configuration

MOSFETs Substrate taps

TSV TSV

TSV TSV

Figure 48 4 diagonal TSVs MOSFET array configuration

43 test structure implementation 53

MOSFETs Substrate taps

TSV

TSV TSV

TSV

Figure 49 4 lateral TSVs MOSFET array configuration

MOSFETs Substrate taps

TSV TSV TSV

TSV TSV TSV

TSV TSV TSV

Figure 410 9 TSVs MOSFET array configuration

43 test structure implementation 54

Configuration RotationFET

width(microm)

FETlength(microm)

Name

0 0deg 07 007 A0

1 0deg 07 007 B0

4diagonal 0deg 07 007 C0

4lateral 0deg 07 007 D0

9 0deg 07 007 E0

0 0deg 05 05 F0

1 0deg 05 05 G0

4diagonal 0deg 05 05 H0

4lateral 0deg 05 05 I0

9 0deg 05 05 J0

0 90deg 07 007 A90

1 90deg 07 007 B90

9 90deg 07 007 E90

0 90deg 05 05 F90

1 90deg 05 05 G90

9 90deg 05 05 J90

Table 42 MOSFET array configurations for test pad modulesPTP1PTP2

43 test structure implementation 55

2μm 1μm

Metal2Metal1PolyActive

FET

DUMMY

TAP

TSV

Figure 411 MOSFET array detail

432 Digital selection logic

Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement

The procedure for selecting a single MOSFET device for meas-urement was as follows

bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled

bull MOSFET Gate terminal (G0 to G15) was connected to IO

pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates

bull MOSFET Drain terminal (D0 to D15) was connected to IO

pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder

43 test structure implementation 56

D0

D15

DF[0-3]

DS[0-3]

Drain-select decoder (2-bit)

SSF

SS

Array-select decoder (4-bit)

VG

Gate-select decoder (4-bit)

G0 G15

MOSFET Array

Figure 412 Digital selection logic

and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit

Drain-select decoder was adequate for accessing all 16Drain terminals in the array

When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)

A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating

43 test structure implementation 57

Gatedecoder

Metal2Metal1PolyActive

Transmissiongates (DS)

Transmission gates (G)

MOSFETArray

Figure 413 MOSFET array with digital selection logic

43 test structure implementation 58

BLK

A[0] A[1] A[2] A[3]

DEC[0-15]

Figure 414 4-bit Gate-enableArray-enable decoder

43 test structure implementation 59

DEC

VGGVDD

Transmission gate

Pull-downtransistor

Figure 415 Transmission gate for MOSFET Gate terminals

Transmission gates

FORCE

SENSE

DRAINSOURCE

EN

VDD

Figure 416 Transmission gates for MOSFET DrainSource terminals

43 test structure implementation 60

A[0] A[1]

DEC[0-3]

Figure 417 2-bit Drain-select decoder

44 experimental measurements 61

Configuration Type Name Wafer

1 No-TSVPMOS

(05microm times 05microm)F0 D08

2 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D08

3 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D20

4 1-TSV 0ordm 60 ordmCPMOS

(05microm times 05microm)G0 D08

54-lateral TSV 0ordm 25

ordmCPMOS

(05microm times 05microm)I0 D08

6 1-TSV 90ordm 25 ordmCPMOS

(05microm times 05microm)G90 D08

7 1-TSV 0ordm 25 ordmCPMOS (07microm times

007microm)B0 D08

Table 43 MOSFET array configurations selected for experimental eval-uation

44 experimental measurements

Test modules PTP1 and PTP2 implementing the presented MOS-

FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43

441 Data processing methodology

For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress

Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing

44 experimental measurements 62

ABCDEFBBC

FCDDFBCFFC

FCB

A

BAACADE

CECBBC

CFFCDE

B$BCFFBDBCBF

amp(BFC)B

BC

FCDDFA

CAF

Figure 418 Processing of measurement results

the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (

radic98) The described procedure for

processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not

placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)

442 Measurement setup

For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO

signal generation and measurement was provided by standard

44 experimental measurements 63

ABCDECDBF

ABECBF

AB

AB

Figure 419 Interpolation of measurement results

laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420

The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44

ABCCDE

F

AA

A

DBCB

Figure 420 Measurement setup

44 experimental measurements 64

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

DEC[0-9] InputDigital Output

(GATEDRAINARRAY SelectDecoders Input)

DF[0-3] Input SMU (Drain Force terminal)

DS[0-3] Output SMU (Drain Sense terminal)

SF Input SMU (Source Force terminal)

SS Output SMU (Source Sense terminal)

VG Input Digital Output (Gate voltage)

Table 44 PTP1PTP2 test pad module instrument connections

443 PMOS (long channel)

No-TSV configuration

This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance

As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure

1 TSV configuration

In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo

44 experimental measurements 65

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

096

097

098

099

100

101

Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)

and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements

To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm

The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph

44 experimental measurements 66

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

105

110

115

120TSV

Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)

44 experimental measurements 67

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

13

TSV

Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)

we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV

even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after

increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

09

1

11

12

13

14

Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)

44 experimental measurements 68

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115TSV

Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)

at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)

4 lateral TSV configuration

In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430

respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)

90˚ rotation TSV configuration

In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has

44 experimental measurements 69

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

085

090

095

100

105

110

115

TSV

TSV TSV

TSV

Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)

44 experimental measurements 70

2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309

095

1

105

11

115

12

X16

Distance (μm)

Normalized

Ion

TSV TSV

Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)

3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098

1102

Y14

Distance (μm)

Normalized

Ion

TSV TSV

Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)

44 experimental measurements 71

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115

120TSV

Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)

been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm

444 PMOS (short channel)

In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

08509

0951

10511

11512

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)

44 experimental measurements 72

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

094

096

098

100

102

104

106

108

110

TSV

Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)

measurements for X = 16 and Y = 14 in Figure 434 At at 1microm

distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)

45 summary and conclusions 73

Size

Waf

er

Con

figu

rati

on

Rot

atio

n

Eff

ect

1micro

mla

t

Eff

ect

1micro

mlo

ng

KO

Z

L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm

L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm

L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm

L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm

L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm

S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm

Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices

45 summary and conclusions

In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET

device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors

The structure was implemented in Imecrsquos experimental 65nm

3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations

In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits

Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short

45 summary and conclusions 74

channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress

Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure

1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex

2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy

3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away

4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic

The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated

In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs

5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S

51 introduction

TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2

DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs

increases linearly with the number of tiers and interconnections(Figure 51)

The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm

node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)

In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]

52 energy-recovery tsv interconnects

As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-

75

52 energy-recovery tsv interconnects 76

1XTSV

Tier1

TSV

Tier0LOGIC 0

LOGIC 0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n Tier1

Tier0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n

Tier(m-1)

Tier(m)

mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-

tion density

recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]

EER prop

(

RCL

T

)

CLV2DD (51)

In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]

ECMOS =12

CLV2DD (52)

Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits

EER

ECMOSprop

2RCL

T(53)

From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small

In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC

52 energy-recovery tsv interconnects 77

Tier2Tier1TSV

LOGIC 1

LOGIC 2

driver

TSV

LOGIC 1

LOGIC 2

CMOS

driver

Convert

Energy-recovery

PCG

Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery

can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV

interconnectsThe differences between a typical CMOS driver in a TSV-based

3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form

Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values

52 energy-recovery tsv interconnects 78

IN IN

ININ

OUT OUT

IN

IN

OUT

OUT

Figure 53 Adiabatic driver

Figure 54 A simple pulse-to-level converter (P2LC)

An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded

The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)

For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with

53 analysis 79

VDD2

Rind

RTG

AdiabaticDriver

CL

M1

T=1f

L ind

RTG

CL

Figure 55 Resonant pulse generator with dual-rail data encoding

Tier2

Tier1

f-fPULSE

VDD2CLK IN

Delay1CLK1

CLK0

Delay2CLK2OUT1

Data In

Data Out

CLKRES

TSV TSV TSV

CLKBufferAdiabatic

Driver

P2LC

f-f

Data1

M1

Figure 56 Energy-recovery scheme for TSV interconnects

a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at

f =1

2πradic

LindCL(54)

A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56

53 analysis

The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy

53 analysis 80

losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme

The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be

EER = EAD + Eind + EM1 + EP2L (55)

Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs

architecture used and could be easily extracted from simulationdata as will be shown later in this chapter

531 Adiabatic driver

There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS

dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances

The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle

ETG = 2ξ

(

RTGCL12 T

)

CLV2DD =

π2

2(RTGCL f )CLV2

DD (56)

The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient

53 analysis 81

Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV

capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be

CL = CTSV + Cn + 6CD (57)

The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience

κTG = RTGCn rArr RTG =κTG

Cn(58)

Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2

DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver

EAD = CnV2DD +

π2

2κTG

Cnf [CTSV + Cn + 6CD]

2 V2DD (59)

The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2

2 κTG f to a variable (y = π2

2 κTG f ) Equation 59then becomes

EAD = D middot CnV2DD +

y

Cn[CTSV + (6b + 1)Cn]

2 V2DD

=

[

Cn

(

D

y+ (6b + 1)2

)

+1

CnC2

TSV

]

middot yV2DD

+(12b + 2)CTSV middot yV2DD (510)

Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when

53 analysis 82

they become equal Solving equation Cn

(

Dy + (6b + 1)2

)

= 1Cn

C2TSV

for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV

Cn(opt) =

radic

[D

y+ (6b + 1)2]minus1CTSV (511)

Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver

EAD(min) =

[radic

D

(6b + 1)2y+ 1 + 1

]

middot (12b+ 2)yCTSVV2DD (512)

532 Inductorrsquos parasitic resistance

The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as

Qind =1

Rind

radic

Lind

CLrArr Rind =

1QindCL2π f

(513)

Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513

Eind =π2

2(RindCL f )CLV2

DD =π

4CL

QindV2

DD (514)

533 Switch M1

Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1

Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as

EM1(min) = 2IM1(rms)VGM1

radic

mκM1

f(515)

54 simulations 83

Delay2

CLK1

CLK2Delay1

CLKW

PULSE

CLK0T

PD

OUT1

PULSE

P2L

RES

CLK

Figure 57 Energy-recovery timing constraints

534 Timing Constraints

The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57

The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)

54 simulations

To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the

54 simulations 84

M1IN IN

IN

OUT OUT

IN

Figure 58 Test case for the evaluation of the theoretical model

Q250MHz 500MHz 750MHz

T S e () T S e () T S e ()

6 360 409 -120 476 516 -78 578 613 -57

8 308 347 -114 417 449 -70 515 543 -52

10 276 312 -115 382 411 -70 477 498 -43

12 255 289 -119 358 386 -71 451 472 -44

Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)

adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58

The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51

The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations

In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency

54 simulations 85

5 6 7 8 9 10 11 12 130

10

20

30

40

50

60

70

T250 S250 T500 S500 T750 S750

Q factor

Energydissipation(fF)

Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)

5 6 7 8 9 10 11 12 13

-14

-12

-10

-8

-6

-4

-2

0

250MHz 500MHz 750MHz

Q factor

Error()

Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator

The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation

54 simulations 86

M1

TSV

TSV

TSV

Data in OutCMOS

OutER

Energy-recovery

CMOS

P2LC

Adiabaticdriver

Figure 511 Test case for comparison of energy-recovery to CMOS

541 Comparison to CMOS

When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F

for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is

described by Equation 52 Appending the data switching activity(D)

ECMOS = D middot 12

CLV2DD (516)

Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS

circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison

Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range

54 simulations 87

Figure 512 Improved P2LC design

0 100 200 300 400 500 600 700 800 9000

2

4

6

8

10

12

14

Frequency (MHz)

Energy

dissipation(fJ)

Figure 513 Simulated energy dissipation of improved P2LC design

Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current

The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor

54 simulations 88

4 6 8 10 12 14 16 18 20 22-20

-10

0

10

20

30

40

50

60

800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor

Energyred uction()

Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)

4 6 8 10 12 14 16 18 20 220050

100150200250300350400450

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)

values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors

In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies

In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q

factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total

In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS

circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in

54 simulations 89

4 6 8 10 12 14 16 18 20 2200

100

200

300

400

500

600

700

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)

20 40 60 80 100 120 140 160 180 200 220-100

00

100

200

300

400

500

Q6 Q10 Q14 Q18

TSV capacitance (fF)

Energy

red u

ction(

)

Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)

Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value

Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz

Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm

node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS

54 simulations 90

4 6 8 10 12 14 16 18 20 22-40

-20

0

20

40

60

80

D=05 D=07 D=09 D=1

Q factor

Energyred uction()

Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)

4 6 8 10 12 14 16 18 20 220

10

20

30

40

50

60

130nm 65nm

Q factor

Energyred uction()

Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)

55 summary and conclusions 91

55 summary and conclusions

The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process

The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement

Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme

The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process

6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R

61 introduction

In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance

Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements

62 physical implementation

The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5

The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61

The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz

were calculated using the methodology developed in Chapter 5

and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery

circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)

92

62 physical implementation 93

AB

AB

CDEFD

C

ABCD

EFD

C

Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)

Adiabatic driver 80-bit circuit

Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )

RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)

CL 207 f F (Eq57) Lind 613nH (Eq54)

Table 61 Energy-recovery IO circuit parameters

62 physical implementation 94

IntegratedInductor

VINDOER

PulseGenerator

S1AS2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

Figure 62 Energy-recovery IO circuit schematic diagram

Metal2Metal1PolyActiveVDD

GND

RES_CLK

DATA

OUT

OUT_B

DATA

OUT_B OUT

RES_CLK

GND

VDD

Figure 63 Adiabatic driver (layoutschematic view)

the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs

621 Adiabatic driver

The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63

62 physical implementation 95

Metal2Metal1PolyActive

IN

OUT_B

VDD

IN_B

OUT

GND

OUT_B

IN

GND

VDD

OUT

IN_B

Figure 64 P2LC (layoutschematic view)

622 Pulse-to-level converter

The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals

623 Data generator

A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1

As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE

624 Pulse generator

The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the

62 physical implementation 96

Metal2Metal1PolyActive EN

CLK

VDD

Q

GND

D

CLKGND

ENVDD

QF-F

D

GND

VDDCLK

D Q

Figure 65 Data generator (layoutschematic view)

Delay1

WPULSE

PULSE

CLKTCLK

DATA

CLK_RES

Figure 66 Timing constraints for the energy-recovery demonstratorcircuit

62 physical implementation 97

PULSES1A

S2

GND

VDD

Metal2Metal1PolyActive

VDD

S1

S2

GND

PULSE

Figure 67 Pulse generator (layoutschematic view)

S2ΔPhase

WPULSE

PULSE

S1A

Figure 68 Pulse generator InputOutput signals

gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control

For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68

625 Integrated inductor

The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-

1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)

62 physical implementation 98

100 200 300 400 500 600 700 8005

7

9

11

13

15

17

19

21

10

15

20

25

30

35

40

45

Q (Simulated) Energy Reduction ()

Frequency (MHz)

Qfactor

Energyreduction()

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS

lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)

As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited

For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610

626 Top level design

The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)

For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously

62 physical implementation 99

500μ

m

500μm

Figure 610 Planar square spiral inductor designed in ASITIC

shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance

The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG

80 = 16880 = 21Ω However as

the current flows to lower branch levels in the tree the perceivedresistance increases to RTG

16 RTG4 and on the last one to RTG Any

additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512

Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20

4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit

To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of

62 physical implementation 100

Adiabatic drivers

RTG

16

RTG

80

RTG

4

RTG

Level 1Level 2Level 3Level 4

Figure 611 Resonant pulse distribution tree

Branch level Wire resistance

1st 5 middot 16880 = 105mΩ

2nd 5 middot 16816 = 525mΩ

3rd 5 middot 1684 = 21Ω

4th 5 middot 168Ω = 84Ω

Total load 168+8480 + 21

20 + 05255 + 0105 = 252Ω

Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree

63 post-layout simulation and comparison 101

+-

+-Metal2 Metal1

180μ

m

100μm

Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)

the energy-recovery IO circuit in layout view is illustrated inFigure 613

627 CMOS IO circuit

The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance

The top level layout view of the CMOS IO circuit can be seenin Figure 616

63 post-layout simulation and comparison

The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed

63 post-layout simulation and comparison 102

22mm

12mm

GND GNDVBUF1 VDR

GND

VIND

S1A

S2

EN

OER

Figure 613 Energy-recovery IO circuit in layout view

63 post-layout simulation and comparison 103

OCMOS

S1B DataGenerator

TSV

CMOSDriver

CMOSDriver

TSV Buffer

1

80

EN

Figure 614 CMOS IO circuit schematic diagram

GND

VCMOS

DATA

078u

039u

105u

053u

316u

158u

950u

475u

OUT

Figure 615 CMOS driver schematic

12mm

12mm

OCMOS S1B GND VBUF2 VCMOSGND

GND GNDEN VCMOS

Figure 616 CMOS IO circuit in layout view

63 post-layout simulation and comparison 104

Energy dissipation (80-bits cycle)

Inductor 11pJ (Eq514)

Switch M1 092pJ [ ]

P2LCs 098pJ (Simulated)

Adiabatic drivers 315pJ (Eq512)

Total 615pJ

Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz

After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency

Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme

As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the

63 post-layout simulation and comparison 105

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14RES_CLK

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14DATA

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14OER

Time (ns)

V(V

)

Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit

64 experimental measurements 106

AreaEnergy

dissipation(Theory)

Energydissipation

(Post-layout)

CMOS 144mm2922pJ

(115 f Jbit)

(Eq52)

1265pJ

(158 f Jbit)

Energy recovery 264mm2 615pJ

(77 f Jbit)

768pJ

(96 f Jbit)

Difference +83 minus333 minus393

Table 64 Post-layout simulated energy dissipation Area require-ments

theoretically calculated 333 up to 393 in the post-layoutsimulations

64 experimental measurements

The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively

641 Measurement setup

For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620

The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66

642 Results

CMOS IO circuit

In Figure 621 the input signal S1B and the output signal OCMOS

are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency

64 experimental measurements 107

Figure 618 Energy-recovery IO circuit micrograph

Figure 619 CMOS IO circuit micrograph

AABC

DEFF

D

EEBA

EBBA

Figure 620 Measurement setup

64 experimental measurements 108

Test pad Direction Instrument connection

VIND Power Power supply - gt06V

GND Power Power supply - 0V

S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)

EN InputDigital Output (enable outputdata switching)

OER Output Oscilloscope

VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V

VBUF1 Power Power supply (Buffers) - 12V

Table 65 Energy-recovery circuit instrument connections

Test pad Direction Instrument connection

VCMOS PowerPower supply (CMOS drivers) -12V

GND Power Power supply - 0V

VBUF2 Power Power supply (Buffers) - 12V

S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

OCMOS Output Oscilloscope

EN InputDigital Output (enable outputdata switching)

Table 66 CMOS circuit instrument connections

64 experimental measurements 109

S1B

OCMOS

Figure 621 Oscilloscope output for signals S1B and OCMOS

100 200 300 400 500 600 700 800012345678910

Die1 Die2 Simulation

Frequency (MHz)

Powerdissipation(mW)

Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit

that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected

The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations

Energy-recovery IO circuit

Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER

64 experimental measurements 110

100 200 300 400 500 600 700 800024681012141618

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 623 Measuredsimulated energy dissipation for the CMOScircuit

S1A

OER

Figure 624 Oscilloscope output for signals S1A and OER

shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected

The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)

Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate

65 summary and conclusions 111

200 250 300 350 400 450 500 550 600 65002468101214161820

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit

IntegratedInductor

VINDOER

PulseGenerator

S1A

S2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

200pF

Figure 626 Energy-recovery IO circuit with fault hypothesis

properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER

switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation

65 summary and conclusions

In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them

The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-

65 summary and conclusions 112

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050

055

060

065

070

075

080RES_CLK

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500

02

04

06

08

10

12

14OER

Time (ns)

V(V

)

Figure 627 Simulation of circuit with fault hypothesis

65 summary and conclusions 113

200 250 300 350 400 450 500 550 600 6500

5

10

15

20

25

Die1 Die2 Simulation CL=200pF

Frequency (MHz)

Ene

rgy

diss

ipat

ion

(pJ)

Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis

ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS

circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated

CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

7C O N C L U S I O N S

Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density

As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV

devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs

Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-

FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process

Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process

The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work

114

71 main contributions 115

71 main contributions

From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data

The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations

In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design

72 future work 116

approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances

The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

72 future work

The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV

capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]

In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes

1 The transistors should be in a regular grid in the array forsimple processing of the measurement data

72 future work 117

2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy

3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV

4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues

Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility

In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory

AA P P E N D I X A P H O T O S

Figure A1 3D65 300mm wafer (on chuck)

Figure A2 Probe card

118

appendix a photos 119

Figure A3 3D130 200mm wafer (on chuck)

ProbeHead

Microscope

Figure A4 Probe head

BA P P E N D I X B S TAT I S T I C S

mean (micro)

In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as

micro =1n

n

sumi=1

xi

median (m)

The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean

standard deviation (σ)

Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation

A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability

standard deviation of the mean (σmicro )

The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated

Assuming there is a data set X with n values xi of mean micro

120

appendix b statistics 121

variance(X) = σ2X

variance(micro) = variance( 1n sum

ni=1 xi) =

1n2 variance(sumn

i=1 xi) =1n2 sum

ni=1 variance(xi) =

nn2 variance(X) =1n variance(X) =

σ2Xn rarr

σmicro = σXradicn

coefficient of variation

The coefficient of variation is the ratio of the standard deviationσ to the mean micro

CV =σ

micro

It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage

outlier

Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set

kruskalndash wallis [34]

KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal

B I B L I O G R A P H Y

[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac

currentsourcebroadpurposemn=2602

[2] Max-3d URL httpwwwmicromagiccom

[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet

[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet

[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001

[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI

Design Conf pages 171ndash174 2005 doi 101109ICVD200564

[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf

3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581

[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009

[9] C H Bennett Logical reversibility of computation IBM

Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525

[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038

scientificamerican0785-48

[11] E Beyne 3d system integration technologies In Proc Int

VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113

122

bibliography 123

[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs

22nm-Details_Presentationpdf

[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327

[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534

[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices

Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124

[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942

[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc

IEEE Int Interconnect Technology Conf and 2011 Materials for

Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326

[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf

(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823

[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS

rsquo04 volume 2 2004 doi 101109ISCAS20041329255

[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE

Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260

[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-

bibliography 124

asitics for 3-dimensional integrated circuits In Workshop

Notes Design Automation and Test in Europe (DATE) 2009

[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT

2008 pages 98ndash103 2008 doi 101109IDT20084802475

[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec

beScientificReportSR2009HTML1213307html

[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905

[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE

Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329

[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int

Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311

[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508

[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712

[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii

B9780750677295500446

[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455

bibliography 125

[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591

[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test

Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889

[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on

Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10

1145224081224115

[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical

Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779

[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom

pressroompdfkkuhnKuhn_IWCE_invited_textpdf

[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5

(3)183ndash191 1961 doi 101147rd530183

[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827

[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190

[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf

(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839

[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-

nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079

bibliography 126

[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System

level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm

org101145966747966750

[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453

[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf

PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725

[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE

Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793

[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629

[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp

Exhibition (DATE) pages 1689ndash1694 2010

[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573

[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190

[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi

bibliography 127

A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices

Meeting (IEDM) 2010 doi 101109IEDM20105703278

[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727

[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10

1109JPROC1998658762

[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test

Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103

[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc

56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676

[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443

[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88

(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom

sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on

[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303

bibliography 128

[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium

on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145

10576611057669

[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872

[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004

[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-

sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888

[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156

[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc

IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522

[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501

[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667

[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477

[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-

tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609

bibliography 129

[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology

Conf 2006 doi 101109ECTC20061645678

[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power

Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220

[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc

Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786

[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-

Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338

[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088

[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044

[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int

Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763

[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-

tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http

dxdoiorg101007BF01788562 101007BF01788562

bibliography 130

[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-

electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww

sciencedirectcomsciencearticleB6V44-4829XDJ-2N

2f5b2c221dcf240c38cd13b7b3432f913

[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182

[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541

[78] NHE Weste and DF Harris CMOS VLSI design a cir-

cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775

[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite

printsteve_wozniak_interview

[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-

tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716

[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect

Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710

[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746

[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764

[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915

bibliography 131

[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610

  • Abstract
  • Publications
  • Acknowledgments
  • Contents
  • List of Figures
  • List of Tables
  • Acronyms
  • 1 Introduction
    • 11 Research goals and contribution
    • 12 Thesis outline
      • 2 Background
        • 21 3D integrated circuits
          • 211 3D integration approaches
          • 212 TSV interconnects
            • 22 Energy-recovery logic
              • 221 Energy-recovery architectures
              • 222 Power-clock generators
                • 23 Full-custom design flow
                • 24 Summary
                  • 3 Characterization of TSV capacitance
                    • 31 Introduction
                    • 32 TSV parasitic capacitance
                    • 33 Integrated capacitance measurement techniques
                    • 34 Electrical characterization
                      • 341 Physical implementation
                      • 342 Measurement setup
                      • 343 Experimental results
                        • 35 Summary and conclusions
                          • 4 TSV proximity impact on MOSFET performance
                            • 41 Introduction
                            • 42 TSV-induced thermomechanical stress
                            • 43 Test structure implementation
                              • 431 MOSFET arrays
                              • 432 Digital selection logic
                                • 44 Experimental measurements
                                  • 441 Data processing methodology
                                  • 442 Measurement setup
                                  • 443 PMOS (long channel)
                                  • 444 PMOS (short channel)
                                    • 45 Summary and conclusions
                                      • 5 Evaluation of energy-recovery for TSV interconnects
                                        • 51 Introduction
                                        • 52 Energy-recovery TSV interconnects
                                        • 53 Analysis
                                          • 531 Adiabatic driver
                                          • 532 Inductors parasitic resistance
                                          • 533 Switch M1
                                          • 534 Timing Constraints
                                            • 54 Simulations
                                              • 541 Comparison to CMOS
                                                • 55 Summary and conclusions
                                                  • 6 Energy-recovery 3D-IC demonstrator
                                                    • 61 Introduction
                                                    • 62 Physical implementation
                                                      • 621 Adiabatic driver
                                                      • 622 Pulse-to-level converter
                                                      • 623 Data generator
                                                      • 624 Pulse generator
                                                      • 625 Integrated inductor
                                                      • 626 Top level design
                                                      • 627 CMOS IO circuit
                                                        • 63 Post-layout simulation and comparison
                                                        • 64 Experimental measurements
                                                          • 641 Measurement setup
                                                          • 642 Results
                                                            • 65 Summary and conclusions
                                                              • 7 Conclusions
                                                                • 71 Main contributions
                                                                • 72 Future work
                                                                  • A Appendix A Photos
                                                                  • B Appendix B Statistics
                                                                  • Bibliography
Page 4: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate

Panagiotis Asimakopoulos Optimizing the integration and energy

efficiency of through silicon via-based 3D interconnects PhD Thesiscopy November 2011

A B S T R A C T

The aggressive scaling of CMOS process technology has been driv-ing the rapid growth of the semiconductor industry for more thanthree decades In recent years the performance gains enabledby CMOS scaling have been increasingly challenged by highly-parasitic on-chip interconnects as wire parasitics do not scale atthe same pace Emerging 3D integration technologies based onvertical through-silicon vias (TSVs) promise a solution to the inter-connect performance bottleneck along with reduced fabricationcost and heterogeneous integration

As TSVs are a relatively recent interconnect technology innov-ative test structures are required to evaluate and optimise theprocess as well as extract parameters for the generation of designrules and models From the circuit designerrsquos perspective criticalTSV characteristics are its parasitic capacitance and thermomech-anical stress distribution This work proposes new test structuresfor extracting these characteristics The structures were fabric-ated on a 65nm 3D process and used for the evaluation of thattechnology

Furthermore as TSVs are implemented in large densely inter-connected 3D-system-on-chips (SoCs) the TSV parasitic capacitancemay become an important source of energy dissipation Typicallow-power techniques based on voltage scaling can be usedthough this represents a technical challenge in modern techno-logy nodes In this work a novel TSV interconnection scheme isproposed based on reversible computing which shows frequency-dependent energy dissipation The scheme is analysed using the-oretical modelling while a demonstrator IC was designed basedon the developed theory and fabricated on a 130nm 3D process

iii

P U B L I C AT I O N S

Some ideas and figures have appeared previously in the followingpublications

1 D Perry J Cho S Domae P Asimakopoulos A YakovlevP Marchal G Van der Plas and N Minas An efficientarray structure to characterize the impact of through siliconvias on fet devices In Proc IEEE Int Microelectronic TestStructures (ICMTS) Conf pages 118ndash122 2011 (best paper

award)

2 A Mercha G Van der Plas V Moroz I De Wolf P Asi-

makopoulos N Minas S Domae D Perry M Choi ARedolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron DevicesMeeting (IEDM) 2010

3 A Mercha A Redolfi M Stucchi N Minas J Van OlmenS Thangaraju D Velenis S Domae Y Yang G Katti RLa- bie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D Perry G Vander Plas J H Cho P Marchal Y Travaly E Beyne S Biese-mans and B Swinnen Impact of thinning and throughsilicon via proximity on high-k metal gate first cmos per-formance In Proc Symp VLSI Technology (VLSIT) pages109ndash110 2010

4 P Asimakopoulos G Van der Plas A Yakovlev and PMarchal Evaluation of energy-recovering interconnects forlow-power 3d stacked ics In Proc IEEE Int Conf 3D Sys-tem Integration 3DIC 2009 pages 1ndash5 2009

5 P Asimakopoulos and A Yakovlev An AdiabaticPower-Supply Controller For Asynchronous Logic Circuits20th UK Asynchronous Forum pages 1ndash4 2008

iv

In the end I hope therersquos a little note somewhere

that says I designed a good computer

mdash Steve Wozniak [79]

A C K N O W L E D G E M E N T S

First and foremost I would like to express my gratitude to mysupervisor Prof Alex Yakovlev who provided the perfect balancebetween freedom and guidance allowing me to obtain sphericalknowledge and scope in microelectronics but at the same time topromptly complete my studies Irsquom also grateful to the Engineer-ing and Physical Science Research Council (EPSRC) for fundingmy PhD studies through their Doctoral Training Accounts

As part of my research project I had the opportunity to spendsome time at Imec Belgium The experiences and lessons learnedduring my stay at Imec will undoubtedly prove valuable in myforthcoming professional career I would like to thank Dan Perryfrom Qualcomm for sharing his experience and ideas for the de-sign of the characterization structures implemented on the 65nm

Imec process Special thanks to Shinichi Domae from Panasonicand Dr Dimitrios Velenis for their assistance with the 65nm wafermeasurements as well as Marco Facchini for the time he spendtrying to measure the energy-recovery circuits Last but not leastIrsquod like to thank Geert Van der Plas and Paul Marchal for theiradvices and guidance throughout my stay at Imec

Finally many thanks to my colleagues at Newcastle Universityfor assisting me in diverse ways during my studies I would liketo especially mention Dr Santosh Shedabale and James Dochertyfor reviewing and commenting on parts of this text as well asDr Robin Emery for the numerous technical discussions and forhelping setup the design tools

v

C O N T E N T S

1 Introduction 1

11 Research goals and contribution 2

12 Thesis outline 3

2 Background 5

21 3D integrated circuits 5

211 3D integration approaches 6

212 TSV interconnects 9

22 Energy-recovery logic 13

221 Energy-recovery architectures 17

222 Power-clock generators 18

23 Full-custom design flow 20

24 Summary 21

3 Characterization of TSV capacitance 23

31 Introduction 23

32 TSV parasitic capacitance 23

33 Integrated capacitance measurement techniques 24

34 Electrical characterization 27

341 Physical implementation 28

342 Measurement setup 36

343 Experimental results 37

35 Summary and conclusions 44

4 TSV proximity impact on MOSFET performance 46

41 Introduction 46

42 TSV-induced thermomechanical stress 46

43 Test structure implementation 48

431 MOSFET arrays 51

432 Digital selection logic 55

44 Experimental measurements 61

441 Data processing methodology 61

442 Measurement setup 62

443 PMOS (long channel) 64

444 PMOS (short channel) 71

45 Summary and conclusions 73

5 Evaluation of energy-recovery for TSV interconnects 75

51 Introduction 75

52 Energy-recovery TSV interconnects 75

53 Analysis 79

531 Adiabatic driver 80

532 Inductorrsquos parasitic resistance 82

533 Switch M1 82

534 Timing Constraints 83

54 Simulations 83

541 Comparison to CMOS 86

vi

contents vii

55 Summary and conclusions 91

6 Energy-recovery 3D-IC demonstrator 92

61 Introduction 92

62 Physical implementation 92

621 Adiabatic driver 94

622 Pulse-to-level converter 95

623 Data generator 95

624 Pulse generator 95

625 Integrated inductor 97

626 Top level design 98

627 CMOS IO circuit 101

63 Post-layout simulation and comparison 101

64 Experimental measurements 106

641 Measurement setup 106

642 Results 106

65 Summary and conclusions 111

7 Conclusions 114

71 Main contributions 115

72 Future work 116

a Appendix A Photos 118

b Appendix B Statistics 120

bibliography 122

L I S T O F F I G U R E S

Figure 11 Scaling trend of logic and interconnect delay[30] 2

Figure 21 Comparison of 2D3D Integration 5

Figure 22 System-in-a-package 6

Figure 23 Package-on-package 6

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7

Figure 25 3D wafer-level-packaging 7

Figure 26 3D-stacked IC 8

Figure 27 TSV process steps 9

Figure 28 Before thinningafter thinning 10

Figure 29 TSV stacking 10

Figure 210 Conventional AND gate 14

Figure 211 CMOS switching 16

Figure 212 Adiabatic switching 16

Figure 213 Basic 2N-2P adiabatic circuit 17

Figure 214 2N-2P timing 17

Figure 215 Adiabatic driver 18

Figure 216 Stepwise charging generator 19

Figure 217 1N power clock generator 20

Figure 218 Design flow 22

Figure 219 Inverter in Virtuoso schematic 22

Figure 220 Inverter in Virtuoso layout 22

Figure 31 Isolated 2D wire capacitance 24

Figure 32 TSV parasitic capacitance 25

Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25

Figure 34 CBCM pseudo-inverters 27

Figure 35 Current as a function of frequency in CBCM 28

Figure 36 CBCM pseudo-inverter pair in layout 29

Figure 37 PRC1 test pad module (layout) 30

Figure 38 PRC1 test pad module (micrograph) 30

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34

Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35

viii

List of Figures ix

Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35

Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36

Figure 316 Measurement setup 36

Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39

Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40

Figure 319 TSV capacitance measurement results onwafer D11 40

Figure 320 TSV capacitance measurement results onwafer D18 41

Figure 321 Simulated oxide liner thickness variation 42

Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42

Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44

Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44

Figure 41 TSV thermomechanical stress simulation [40] 47

Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48

Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49

Figure 44 PTP1 test pad module (layout) 50

Figure 45 PTP1 test pad module (micrograph) 50

Figure 46 MOSFET dimensions and rotations 51

Figure 47 1 TSV MOSFET array configuration 52

Figure 48 4 diagonal TSVs MOSFET array configuration52

Figure 49 4 lateral TSVs MOSFET array configuration 53

Figure 410 9 TSVs MOSFET array configuration 53

Figure 411 MOSFET array detail 55

Figure 412 Digital selection logic 56

Figure 413 MOSFET array with digital selection logic 57

Figure 414 4-bit Gate-enableArray-enable decoder 58

Figure 415 Transmission gate for MOSFET Gate terminals59

Figure 416 Transmission gates for MOSFET Drain-Source terminals 59

Figure 417 2-bit Drain-select decoder 60

Figure 418 Processing of measurement results 62

Figure 419 Interpolation of measurement results 63

Figure 420 Measurement setup 63

Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65

List of Figures x

Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66

Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66

Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67

Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67

Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68

Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69

Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69

Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70

Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70

Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71

Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71

Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72

Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72

Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76

Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77

Figure 53 Adiabatic driver 78

Figure 54 A simple pulse-to-level converter (P2LC) 78

Figure 55 Resonant pulse generator with dual-rail dataencoding 79

Figure 56 Energy-recovery scheme for TSV interconnects79

Figure 57 Energy-recovery timing constraints 83

Figure 58 Test case for the evaluation of the theoret-ical model 84

Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85

Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85

Figure 511 Test case for comparison of energy-recoveryto CMOS 86

Figure 512 Improved P2LC design 87

List of Figures xi

Figure 513 Simulated energy dissipation of improvedP2LC design 87

Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88

Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88

Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89

Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =

80 f F f = 500MHz) 89

Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90

Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90

Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93

Figure 62 Energy-recovery IO circuit schematic dia-gram 94

Figure 63 Adiabatic driver (layoutschematic view) 94

Figure 64 P2LC (layoutschematic view) 95

Figure 65 Data generator (layoutschematic view) 96

Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96

Figure 67 Pulse generator (layoutschematic view) 97

Figure 68 Pulse generator InputOutput signals 97

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98

Figure 610 Planar square spiral inductor designed inASITIC 99

Figure 611 Resonant pulse distribution tree 100

Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101

Figure 613 Energy-recovery IO circuit in layout view 102

Figure 614 CMOS IO circuit schematic diagram 103

Figure 615 CMOS driver schematic 103

Figure 616 CMOS IO circuit in layout view 103

Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105

Figure 618 Energy-recovery IO circuit micrograph 107

List of Figures xii

Figure 619 CMOS IO circuit micrograph 107

Figure 620 Measurement setup 107

Figure 621 Oscilloscope output for signals S1B and OC-MOS 109

Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109

Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110

Figure 624 Oscilloscope output for signals S1A and OER110

Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111

Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111

Figure 627 Simulation of circuit with fault hypothesis 112

Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113

Figure A1 3D65 300mm wafer (on chuck) 118

Figure A2 Probe card 118

Figure A3 3D130 200mm wafer (on chuck) 119

Figure A4 Probe head 119

L I S T O F TA B L E S

Table 21 3D interconnect technologies comparison 8

Table 22 Technology parameters for Imec 3D130 process11

Table 23 Technology parameters for Imec 3D65 process11

Table 24 Comparison energy-recoveryconventionalCMOS 16

Table 31 Test structures on the PRC1 module 31

Table 32 PRC1 test pad instrument connections 37

Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38

Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39

Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43

Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45

Table 41 Coefficients of thermal expansion 47

Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54

Table 43 MOSFET array configurations selected forexperimental evaluation 61

Table 44 PTP1PTP2 test pad module instrumentconnections 64

Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73

Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84

Table 61 Energy-recovery IO circuit parameters 93

Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100

Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104

Table 64 Post-layout simulated energy dissipation Area requirements 106

Table 65 Energy-recovery circuit instrument connec-tions 108

Table 66 CMOS circuit instrument connections 108

xiii

A C R O N Y M S

2D two-dimentional

2N-2P 2 nMOS2 pMOS adiabatic logic

3D-IC three-dimentional integrated circuit

3D-SIC 3D stacked-IC

3D-SIP 3D system-in-package

3D three-dimentional

3D-WLP 3D wafer-level-packaging

BCB benzocyclobutene

BEOL back-end-of-line

CAL clocked adiabatic logic

CBCM charge-based capacitance measurement

CMOS complementary metal-oxide-semiconductor

CMP chemical-mechanical planarization

CTE coefficient of thermal expansion

Cu copper

CVD chemical vapor deposition

D2D die-to-die

D2W die-to-wafer

DCVSL differential cascode voltage switch logic

DRC design rule check

DUT device under test

ECRL efficient charge recovery logic

EDA electronic design automation

FEM finite element method

FEOL front-end-of-line

GPIB general purpose interface bus

IC integrated circuit

xiv

acronyms xv

IO inputoutput

KOZ keep-out-zone

LC inductance-capacitance

LVS layout versus schematic

MEMS microelectromechanical systems

MIM metal-insulator-metal

MOSFET metal-oxide-semiconductor field-effect transistor

MOS metal-oxide-semiconductor

MPU microprocessor unit

MuGFET multiple gate field-effect transistor

NMOS n-channel MOSFET

P2LC pulse-to-level converter

PAL pass-transistor adiabatic logic

PCB printed circuit board

PCG power-clock generator

PMD pre-metal dielectric

PMOS p-channel MOSFET

PVT process voltage temperature

QSERL quasi-static energy recovery logic

RC resistance-capacitance

RDL re-distribution layer

RLC resistance-inductance-capacitance

RMS root mean square

SIP system-in-a-package

Si silicon

SMU source measurement unit

SoC system-on-chip

TaN tantalum nitride

TSV through-silicon via

W tungsten

acronyms xvi

W2W wafer-to-wafer

WLP wafer-level-packaging

1I N T R O D U C T I O N

Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965

chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]

As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects

In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)

The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and

1 International Technology Roadmap for Semiconductors

1

11 research goals and contribution 2

Figure 11 Scaling trend of logic and interconnect delay [30]

innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]

Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D

interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density

11 research goals and contribution

As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-

ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]

The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future

12 thesis outline 3

3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection

The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution

The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs

are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process

The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements

12 thesis outline

This thesis is organised into seven Chapters and two Appendixesas follows

2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)

12 thesis outline 4

Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described

Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods

Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process

Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design

Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements

Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work

Appendix A includes photos used as reference in various partsof this work

Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data

2B A C K G R O U N D

The work presented in this thesis contributes in the fields of 3D

integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters

21 3d integrated circuits

CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package

There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy

Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of

ABCDEFBA ABCDEFBA

Figure 21 Comparison of 2D3D Integration

5

21 3d integrated circuits 6

ABCDEFBC

E

Figure 22 System-in-a-package

ABCDEFBC

E

Figure 23 Package-on-package

a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities

The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor

211 3D integration approaches

3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure

bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages

bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)

bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion

The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)

3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps

21 3d integrated circuits 7

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]

A

BCDEAFE

FFDC

CFE

Figure 25 3D wafer-level-packaging

take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]

3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]

The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-

SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost

A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21

21 3d integrated circuits 8

ABCD

EFFDB

F

Figure 26 3D-stacked IC

3D interconnect

technologyAdvantages Disadvantages

3D system-in-package (3D-SIP)

Simple existingtechnology bestheterogeneous

integration

Performance(speed power) low

densitymanufacturing costin high-quantities

3D wafer-level-packaging (3D-

WLP)

Existingtechnology good

compromisebetween

performance-manufacturing

cost

Averageperformance and

density

3D stacked-IC (3D-SIC)

Performance(speed power)

high densitymanufacturing costin high-quantities

Manufacturing costin low-quantities

not ready formanufacturing

Table 21 3D interconnect technologies comparison

21 3d integrated circuits 9

ABBCDE

FBACACDA

ACCD F

Figure 27 TSV process steps

212 TSV interconnects

Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection

In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie

Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively

As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]

21 3d integrated circuits 10

AAAB

CAAAB

ABC

DEF

Figure 28 Before thinningafter thinning

ABC ABC

DE DE

F

DEE

B

CBA

Figure 29 TSV stacking

21 3d integrated circuits 11

Minimum devicechannel length (nm)

130

Supply voltage (V) 12

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 22

TSV resistance (mΩ) ~20

TSV capacitance( f F)

~40

Wafer diameter(mm)

200

Table 22 Technology paramet-ers for Imec 3D130

process

Minimum devicechannel length (nm)

70

Supply voltage (V) 10

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 40

TSV resistance (mΩ) ~40

TSV capacitance( f F)

~80

Wafer diameter(mm)

300

Table 23 Technology paramet-ers for Imec 3D65 pro-cess

Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing

Design tools

Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs

Thermomechanical stress

TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur

21 3d integrated circuits 12

The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance

Thermal analysis

The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink

Electrical characteristics

The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]

Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV

structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process

22 energy-recovery logic 13

Testing

Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems

22 energy-recovery logic

TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology

1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections

2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them

Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that

22 energy-recovery logic 14

Figure 210 Conventional AND gate

were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6

Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded

Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output

Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to

22 energy-recovery logic 15

perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]

In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]

The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =

CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]

E = QVDD = CV2DD (21)

From the total transferred energy only half ( 12 CV2

DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C

The adiabatic charging of an equivalent node capacitance C

is modelled in Figure 212 In contrast to the conventional CMOS

circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to

EAdiabatic prop

(

RC

T

)

CV2DD (22)

The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters

22 energy-recovery logic 16

A

B

CC

D

E

D

Figure 211 CMOS switching

AA

B

C

B

Figure 212 Adiabatic switching

Conventional CMOS Energy-recovery CMOS

2 states True False 3 states True False Off

Energy of information bitdissipated as heat on the

pull-down network

Energy of information bitrecovered by the

power-supply

High speed Lowmedium speed

High energy dissipation Very low energy dissipation

Area efficient Some area overhead

Table 24 Comparison energy-recoveryconventional CMOS

Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]

A comparison between energy-recovery and conventional CMOSis summarised in Table 24

22 energy-recovery logic 17

ABCABC

Figure 213 Basic 2N-2P adiabaticcircuit

ABC

ABC

DEF

EE

DEF

Figure 214 2N-2P timing

221 Energy-recovery architectures

Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency

A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P

circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern

The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel

22 energy-recovery logic 18

Figure 215 Adiabatic driver

MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network

The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]

The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle

Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]

222 Power-clock generators

Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and

22 energy-recovery logic 19

A

A

B

C

C

Figure 216 Stepwise charging generator

recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators

Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow

Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at

f =1

2πradic

LCLOAD

(23)

As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is

23 full-custom design flow 20

A

BCD

E

DFF

CD

Figure 217 1N power clock generator

partially recovered and stored in the inductor for later use RLC

oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip

23 full-custom design flow

All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal

In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available

The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case

1 Cadence Design Systems Inc (httpwwwcadencecom)

24 summary 21

the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step

Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics

24 summary

In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process

2 Mentor Graphics Inc (httpwwwmentorcom)

24 summary 22

ABCDEA

FCCAAE

ABCDEAEC

DE

E

F

CEA

EEAEA

EDE

CAC

E

EC

CAC

Figure 218 Design flow

Figure 219 Inverter in Virtuososchematic

Figure 220 Inverter in Virtuosolayout

3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E

31 introduction

In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology

In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]

From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system

Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated

32 tsv parasitic capacitance

A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric

1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments

23

33 integrated capacitance measurement techniques 24

ABC

Figure 31 Isolated 2D wire capacitance

material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is

C =εox

hwl (31)

In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material

In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)

The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)

33 integrated capacitance measurement techniques

Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV

33 integrated capacitance measurement techniques 25

ABCDE

F

FD

F

F

FD

Figure 32 TSV parasitic capacitance

ABCDE FACDE

ECDE

BCFACDE

FCCDED DBCDE

D

DAB

Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor

33 integrated capacitance measurement techniques 26

characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories

1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]

2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]

The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer

On-chip techniques can achieve better accuracy when the DUT

capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]

In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows

CDUT =Im minus Ire f

VDD middot f(32)

The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of

34 electrical characterization 27

AB

AC

DE

FE

DE

FE

AB

AC

B

Figure 34 CBCM pseudo-inverters

Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot

The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET

capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz

should be adequate to avoid the MOS behaviour of the TSV

34 electrical characterization

A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented

34 electrical characterization 28

AB

CADE

F

AB

E

CAD

B

Figure 35 Current as a function of frequency in CBCM

on the same chip that went beyond the core objectives of thiswork

The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed

341 Physical implementation

In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result

Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics

34 electrical characterization 29

Poly dummy fills

Metal2Metal1PolysiliconActive

1μm 5μm

Figure 36 CBCM pseudo-inverter pair in layout

Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38

The CBCM pseudo-inverters were implemented on either TOP

or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below

Structure rsquoArsquo Single TSV capacitance (TOP die)

The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to

34 electrical characterization 30

DTOP DBTM

B10 A10

BREF AREF

B1 A1

R1 R2

SET VDD

RESET GND

C15 C10

C50 C20

SET VDD

ERC SET

EREF EC Metal2 (TOP)Metal1 (TOP)

Metal1 (BTM)Metal2 (BTM)

Figure 37 PRC1 test pad module(layout)

Figure 38 PRC1 test pad module(micrograph)

34 electrical characterization 31

Structure A B

TSVs - 1 2x5 - 1 2x5

Inverterlocation

TOP TOP TOP BTM BTM BTM

TSVconnection

TOP TOP TOP BTM BTM BTM

Multi-TSVconnection

- - TOP - - BTM

Multi-TSVpitch (microm)

- - 15 - - 15

Pad name AREF A1 A10 BREF B1 B10

Structure C D

TSVs 2x2 2x2 2x2 2x2 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP TOP

TSVconnection

TOP TOP TOP TOP TOP TOP

Multi-TSVconnection

TOP TOP TOP TOP TOP BTM

Multi-TSVpitch (microm)

10 15 20 50 20 20

Pad name C10 C15 C20 C50 DTOP DBTM

Structure R E

TSVs 1 1 - 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP

TSVconnection

TOP TOP - TOP TOP

Multi-TSVconnection

- - - TOP BTM

Multi-TSVpitch (microm)

- - - 20 20

Pad name R1 R2 EREF EC ERC

Table 31 Test structures on the PRC1 module

34 electrical characterization 32

Pseudo-inverters

2x5 TSVs

TSV

Reference

Metal2 (TOP)Metal1 (TOP)

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)

the single TSV DUT the structure also contains 10 parallel TSVs

arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude

Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)

The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side

Structure rsquoCrsquo Variable pitch TSV capacitance

The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the

34 electrical characterization 33

2x5 TSVs

Reference

TSV

Pseudo-inverters

Metal2 (BTM)Metal1 (BTM)

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)

literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process

The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative

Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-

TOM die

The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking

34 electrical characterization 34

2x2 TSVs - 10μm

2x2 TSVs - 15μm

2x2 TSVs - 20μm

2x2 TSVs - 50μm

Metal2 (TOP)Metal1 (TOP)

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die

34 electrical characterization 35

Metal2 (TOP)Metal1 (TOP)

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Driver

Reference

Driver

Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)

Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy

Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability

Structure rsquoErsquo TSV capacitance effect on signal propagation delay

Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult

34 electrical characterization 36

ABC

D

ECBFC

ABC

Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)

ABC

DEF

CD

A

CA

A

Figure 316 Measurement setup

342 Measurement setup

The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316

The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-

2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup

34 electrical characterization 37

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

SET InputPulse generator

(Non-overlapping with RESET)

RESET InputPulse generator

(Non-overlapping with SET)

A[REF 1 10] InputSMU (Inverter source - 1V

supply)

B[REF 1 10] InputSMU (Inverter source - 1V

supply)

C[10 15 2050]

InputSMU (Inverter source - 1V

supply)

D[TOP BTM] InputSMU (Inverter source - 1V

supply)

R[1 2] InputSMU (Inverter source - 1V

supply)

E[REF C RC] Output Oscilloscope (high-impedance)

Table 32 PRC1 test pad instrument connections

sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32

343 Experimental results

There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods

When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy

34 electrical characterization 38

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 454

Mean capacitance 709 f F

Standard deviation (σ) 19 f F

Coefficient of variation (3σmicro) 80

Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo

The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability

Die to die (D2D) variability

The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)

On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid

On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691

34 electrical characterization 39

-60 -40 -20 00 20 40 60 80

σ-σ

706

Deviation from mean (fF)

0

005

01

015

02

025

Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 414

Mean capacitance 895 f F

Standard deviation (σ) 22 f F

Coefficient of variation (3σmicro) 74

Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo

of the experimental data within plusmnσ of the mean capacitancevalue

Correlation between TSV capacitance and oxide liner thickness vari-

ation

The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers

Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication

34 electrical characterization 40

σ-σ

691

Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70

0

002

004

006

008

01

012

014

016

018

02

Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function

075

080

085

090

095

100

105

110

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

075

080

085

090

095

100

105

Wafer area

Figure 319 TSV capacitance measurement results on wafer D11

34 electrical characterization 41

086088090092094096098100102104106108

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

086

088

090

092

094

096

098

100

102

104

106

108

Wafer area

Figure 320 TSV capacitance measurement results on wafer D18

tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo

Wafer to wafer (W2W) variability

Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters

34 electrical characterization 42

109

100

095Nor

mal

ized

oxi

de th

ickn

ess

-5-10 -15 -20 -25 -30

510

1520

2530

0Die Y

Die X

Figure 321 Simulated oxide liner thickness variation

60 65 70 75 80 85 90 95 1000

005

01

015

02

025D11 D18

plusmnσ

plusmnσ

Capacitance (fF)

Figure 322 TSV capacitance W2W variability - Probability density func-tions

34 electrical characterization 43

Structure rsquoCrsquo (Fig 311)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 424

Experiment C10 C50

Number of TSVs 4 4

Mean capacitance 4073 f F 4125 f F

Standard deviation (σ) 84 f F 74 f F

Population within σ 695 653

Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo

Variable pitch TSV capacitance

The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm

were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires

As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50

are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability

To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]

Measurement accuracy of CBCM

The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we

3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)

35 summary and conclusions 44

376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320

001

002

003

004

005

006C10 C50

Capacitance (fF)

plusmnσC10

plusmnσC50

695 653

Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function

-80 -60 -40 -20 00 20 40 60 80 1000

002

004

006

008

01

012

014

016

018σ-σ

688

Deviation from mean (fF)

Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function

observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs

connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance

35 summary and conclusions

In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement

35 summary and conclusions 45

Method LCR CBCM

Total dies 472 472

Processed dies (excluding outliers) 393 414

Mean capacitance 820 f F 895 f F

Standard deviation (σ) 23 f F 22 f F

Coefficient of variation (3σmicro) 86 74

Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo

results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]

The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration

The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV

parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM

In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process

4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E

41 introduction

The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV

filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance

Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models

In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]

42 tsv-induced thermomechanical stress

Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or

46

42 tsv-induced thermomechanical stress 47

Material CTE at 20 ˚C (10minus6K)

Si 3

Cu 17

W 45

Table 41 Coefficients of thermal expansion

Figure 41 TSV thermomechanical stress simulation [40]

polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate

Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction

It is well known that stress can affect carrier mobility in MOS-

FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance

Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type

43 test structure implementation 48

Figure 42 TSV thermomechanical stress interaction between two TSVs[40]

proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative

The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections

43 test structure implementation

TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible

43 test structure implementation 49

Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]

The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET

devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types

Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP

die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45

43 test structure implementation 50

A0

B0

C0

D0

E0

F0

G0

H0

I0

J0

A90

B90

E90

F90

G90

J90

DS3 DF3

DF2 DS0

DS2 DF0

DF1 DS1

SS SF

DEC6 VDD

DEC7 GND

DEC8 VG

DEC4 DEC9

DEC5 GND

DEC2 DEC3

DEC0 DEC1

Arraydecoder

Draindecoder

Figure 44 PTP1 test pad mod-ule (layout)

Figure 45 PTP1 test pad mod-ule (micrograph)

43 test structure implementation 51

Metal2Metal1PolyActive

TSV

MOSFETs

07μm007μm 0deg

07μm007μm 90deg

05μm05μm 90deg

05μm05μm 0deg

Figure 46 MOSFET dimensions and rotations

431 MOSFET arrays

Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)

The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs

lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44

The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)

Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution

43 test structure implementation 52

TSV

MOSFETs Substrate taps

Figure 47 1 TSV MOSFET array configuration

MOSFETs Substrate taps

TSV TSV

TSV TSV

Figure 48 4 diagonal TSVs MOSFET array configuration

43 test structure implementation 53

MOSFETs Substrate taps

TSV

TSV TSV

TSV

Figure 49 4 lateral TSVs MOSFET array configuration

MOSFETs Substrate taps

TSV TSV TSV

TSV TSV TSV

TSV TSV TSV

Figure 410 9 TSVs MOSFET array configuration

43 test structure implementation 54

Configuration RotationFET

width(microm)

FETlength(microm)

Name

0 0deg 07 007 A0

1 0deg 07 007 B0

4diagonal 0deg 07 007 C0

4lateral 0deg 07 007 D0

9 0deg 07 007 E0

0 0deg 05 05 F0

1 0deg 05 05 G0

4diagonal 0deg 05 05 H0

4lateral 0deg 05 05 I0

9 0deg 05 05 J0

0 90deg 07 007 A90

1 90deg 07 007 B90

9 90deg 07 007 E90

0 90deg 05 05 F90

1 90deg 05 05 G90

9 90deg 05 05 J90

Table 42 MOSFET array configurations for test pad modulesPTP1PTP2

43 test structure implementation 55

2μm 1μm

Metal2Metal1PolyActive

FET

DUMMY

TAP

TSV

Figure 411 MOSFET array detail

432 Digital selection logic

Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement

The procedure for selecting a single MOSFET device for meas-urement was as follows

bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled

bull MOSFET Gate terminal (G0 to G15) was connected to IO

pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates

bull MOSFET Drain terminal (D0 to D15) was connected to IO

pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder

43 test structure implementation 56

D0

D15

DF[0-3]

DS[0-3]

Drain-select decoder (2-bit)

SSF

SS

Array-select decoder (4-bit)

VG

Gate-select decoder (4-bit)

G0 G15

MOSFET Array

Figure 412 Digital selection logic

and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit

Drain-select decoder was adequate for accessing all 16Drain terminals in the array

When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)

A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating

43 test structure implementation 57

Gatedecoder

Metal2Metal1PolyActive

Transmissiongates (DS)

Transmission gates (G)

MOSFETArray

Figure 413 MOSFET array with digital selection logic

43 test structure implementation 58

BLK

A[0] A[1] A[2] A[3]

DEC[0-15]

Figure 414 4-bit Gate-enableArray-enable decoder

43 test structure implementation 59

DEC

VGGVDD

Transmission gate

Pull-downtransistor

Figure 415 Transmission gate for MOSFET Gate terminals

Transmission gates

FORCE

SENSE

DRAINSOURCE

EN

VDD

Figure 416 Transmission gates for MOSFET DrainSource terminals

43 test structure implementation 60

A[0] A[1]

DEC[0-3]

Figure 417 2-bit Drain-select decoder

44 experimental measurements 61

Configuration Type Name Wafer

1 No-TSVPMOS

(05microm times 05microm)F0 D08

2 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D08

3 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D20

4 1-TSV 0ordm 60 ordmCPMOS

(05microm times 05microm)G0 D08

54-lateral TSV 0ordm 25

ordmCPMOS

(05microm times 05microm)I0 D08

6 1-TSV 90ordm 25 ordmCPMOS

(05microm times 05microm)G90 D08

7 1-TSV 0ordm 25 ordmCPMOS (07microm times

007microm)B0 D08

Table 43 MOSFET array configurations selected for experimental eval-uation

44 experimental measurements

Test modules PTP1 and PTP2 implementing the presented MOS-

FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43

441 Data processing methodology

For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress

Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing

44 experimental measurements 62

ABCDEFBBC

FCDDFBCFFC

FCB

A

BAACADE

CECBBC

CFFCDE

B$BCFFBDBCBF

amp(BFC)B

BC

FCDDFA

CAF

Figure 418 Processing of measurement results

the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (

radic98) The described procedure for

processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not

placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)

442 Measurement setup

For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO

signal generation and measurement was provided by standard

44 experimental measurements 63

ABCDECDBF

ABECBF

AB

AB

Figure 419 Interpolation of measurement results

laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420

The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44

ABCCDE

F

AA

A

DBCB

Figure 420 Measurement setup

44 experimental measurements 64

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

DEC[0-9] InputDigital Output

(GATEDRAINARRAY SelectDecoders Input)

DF[0-3] Input SMU (Drain Force terminal)

DS[0-3] Output SMU (Drain Sense terminal)

SF Input SMU (Source Force terminal)

SS Output SMU (Source Sense terminal)

VG Input Digital Output (Gate voltage)

Table 44 PTP1PTP2 test pad module instrument connections

443 PMOS (long channel)

No-TSV configuration

This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance

As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure

1 TSV configuration

In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo

44 experimental measurements 65

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

096

097

098

099

100

101

Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)

and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements

To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm

The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph

44 experimental measurements 66

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

105

110

115

120TSV

Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)

44 experimental measurements 67

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

13

TSV

Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)

we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV

even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after

increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

09

1

11

12

13

14

Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)

44 experimental measurements 68

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115TSV

Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)

at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)

4 lateral TSV configuration

In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430

respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)

90˚ rotation TSV configuration

In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has

44 experimental measurements 69

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

085

090

095

100

105

110

115

TSV

TSV TSV

TSV

Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)

44 experimental measurements 70

2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309

095

1

105

11

115

12

X16

Distance (μm)

Normalized

Ion

TSV TSV

Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)

3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098

1102

Y14

Distance (μm)

Normalized

Ion

TSV TSV

Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)

44 experimental measurements 71

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115

120TSV

Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)

been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm

444 PMOS (short channel)

In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

08509

0951

10511

11512

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)

44 experimental measurements 72

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

094

096

098

100

102

104

106

108

110

TSV

Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)

measurements for X = 16 and Y = 14 in Figure 434 At at 1microm

distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)

45 summary and conclusions 73

Size

Waf

er

Con

figu

rati

on

Rot

atio

n

Eff

ect

1micro

mla

t

Eff

ect

1micro

mlo

ng

KO

Z

L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm

L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm

L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm

L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm

L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm

S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm

Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices

45 summary and conclusions

In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET

device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors

The structure was implemented in Imecrsquos experimental 65nm

3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations

In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits

Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short

45 summary and conclusions 74

channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress

Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure

1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex

2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy

3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away

4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic

The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated

In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs

5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S

51 introduction

TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2

DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs

increases linearly with the number of tiers and interconnections(Figure 51)

The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm

node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)

In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]

52 energy-recovery tsv interconnects

As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-

75

52 energy-recovery tsv interconnects 76

1XTSV

Tier1

TSV

Tier0LOGIC 0

LOGIC 0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n Tier1

Tier0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n

Tier(m-1)

Tier(m)

mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-

tion density

recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]

EER prop

(

RCL

T

)

CLV2DD (51)

In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]

ECMOS =12

CLV2DD (52)

Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits

EER

ECMOSprop

2RCL

T(53)

From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small

In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC

52 energy-recovery tsv interconnects 77

Tier2Tier1TSV

LOGIC 1

LOGIC 2

driver

TSV

LOGIC 1

LOGIC 2

CMOS

driver

Convert

Energy-recovery

PCG

Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery

can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV

interconnectsThe differences between a typical CMOS driver in a TSV-based

3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form

Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values

52 energy-recovery tsv interconnects 78

IN IN

ININ

OUT OUT

IN

IN

OUT

OUT

Figure 53 Adiabatic driver

Figure 54 A simple pulse-to-level converter (P2LC)

An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded

The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)

For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with

53 analysis 79

VDD2

Rind

RTG

AdiabaticDriver

CL

M1

T=1f

L ind

RTG

CL

Figure 55 Resonant pulse generator with dual-rail data encoding

Tier2

Tier1

f-fPULSE

VDD2CLK IN

Delay1CLK1

CLK0

Delay2CLK2OUT1

Data In

Data Out

CLKRES

TSV TSV TSV

CLKBufferAdiabatic

Driver

P2LC

f-f

Data1

M1

Figure 56 Energy-recovery scheme for TSV interconnects

a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at

f =1

2πradic

LindCL(54)

A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56

53 analysis

The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy

53 analysis 80

losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme

The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be

EER = EAD + Eind + EM1 + EP2L (55)

Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs

architecture used and could be easily extracted from simulationdata as will be shown later in this chapter

531 Adiabatic driver

There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS

dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances

The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle

ETG = 2ξ

(

RTGCL12 T

)

CLV2DD =

π2

2(RTGCL f )CLV2

DD (56)

The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient

53 analysis 81

Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV

capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be

CL = CTSV + Cn + 6CD (57)

The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience

κTG = RTGCn rArr RTG =κTG

Cn(58)

Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2

DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver

EAD = CnV2DD +

π2

2κTG

Cnf [CTSV + Cn + 6CD]

2 V2DD (59)

The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2

2 κTG f to a variable (y = π2

2 κTG f ) Equation 59then becomes

EAD = D middot CnV2DD +

y

Cn[CTSV + (6b + 1)Cn]

2 V2DD

=

[

Cn

(

D

y+ (6b + 1)2

)

+1

CnC2

TSV

]

middot yV2DD

+(12b + 2)CTSV middot yV2DD (510)

Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when

53 analysis 82

they become equal Solving equation Cn

(

Dy + (6b + 1)2

)

= 1Cn

C2TSV

for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV

Cn(opt) =

radic

[D

y+ (6b + 1)2]minus1CTSV (511)

Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver

EAD(min) =

[radic

D

(6b + 1)2y+ 1 + 1

]

middot (12b+ 2)yCTSVV2DD (512)

532 Inductorrsquos parasitic resistance

The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as

Qind =1

Rind

radic

Lind

CLrArr Rind =

1QindCL2π f

(513)

Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513

Eind =π2

2(RindCL f )CLV2

DD =π

4CL

QindV2

DD (514)

533 Switch M1

Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1

Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as

EM1(min) = 2IM1(rms)VGM1

radic

mκM1

f(515)

54 simulations 83

Delay2

CLK1

CLK2Delay1

CLKW

PULSE

CLK0T

PD

OUT1

PULSE

P2L

RES

CLK

Figure 57 Energy-recovery timing constraints

534 Timing Constraints

The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57

The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)

54 simulations

To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the

54 simulations 84

M1IN IN

IN

OUT OUT

IN

Figure 58 Test case for the evaluation of the theoretical model

Q250MHz 500MHz 750MHz

T S e () T S e () T S e ()

6 360 409 -120 476 516 -78 578 613 -57

8 308 347 -114 417 449 -70 515 543 -52

10 276 312 -115 382 411 -70 477 498 -43

12 255 289 -119 358 386 -71 451 472 -44

Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)

adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58

The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51

The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations

In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency

54 simulations 85

5 6 7 8 9 10 11 12 130

10

20

30

40

50

60

70

T250 S250 T500 S500 T750 S750

Q factor

Energydissipation(fF)

Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)

5 6 7 8 9 10 11 12 13

-14

-12

-10

-8

-6

-4

-2

0

250MHz 500MHz 750MHz

Q factor

Error()

Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator

The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation

54 simulations 86

M1

TSV

TSV

TSV

Data in OutCMOS

OutER

Energy-recovery

CMOS

P2LC

Adiabaticdriver

Figure 511 Test case for comparison of energy-recovery to CMOS

541 Comparison to CMOS

When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F

for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is

described by Equation 52 Appending the data switching activity(D)

ECMOS = D middot 12

CLV2DD (516)

Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS

circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison

Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range

54 simulations 87

Figure 512 Improved P2LC design

0 100 200 300 400 500 600 700 800 9000

2

4

6

8

10

12

14

Frequency (MHz)

Energy

dissipation(fJ)

Figure 513 Simulated energy dissipation of improved P2LC design

Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current

The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor

54 simulations 88

4 6 8 10 12 14 16 18 20 22-20

-10

0

10

20

30

40

50

60

800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor

Energyred uction()

Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)

4 6 8 10 12 14 16 18 20 220050

100150200250300350400450

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)

values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors

In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies

In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q

factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total

In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS

circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in

54 simulations 89

4 6 8 10 12 14 16 18 20 2200

100

200

300

400

500

600

700

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)

20 40 60 80 100 120 140 160 180 200 220-100

00

100

200

300

400

500

Q6 Q10 Q14 Q18

TSV capacitance (fF)

Energy

red u

ction(

)

Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)

Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value

Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz

Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm

node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS

54 simulations 90

4 6 8 10 12 14 16 18 20 22-40

-20

0

20

40

60

80

D=05 D=07 D=09 D=1

Q factor

Energyred uction()

Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)

4 6 8 10 12 14 16 18 20 220

10

20

30

40

50

60

130nm 65nm

Q factor

Energyred uction()

Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)

55 summary and conclusions 91

55 summary and conclusions

The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process

The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement

Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme

The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process

6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R

61 introduction

In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance

Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements

62 physical implementation

The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5

The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61

The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz

were calculated using the methodology developed in Chapter 5

and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery

circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)

92

62 physical implementation 93

AB

AB

CDEFD

C

ABCD

EFD

C

Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)

Adiabatic driver 80-bit circuit

Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )

RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)

CL 207 f F (Eq57) Lind 613nH (Eq54)

Table 61 Energy-recovery IO circuit parameters

62 physical implementation 94

IntegratedInductor

VINDOER

PulseGenerator

S1AS2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

Figure 62 Energy-recovery IO circuit schematic diagram

Metal2Metal1PolyActiveVDD

GND

RES_CLK

DATA

OUT

OUT_B

DATA

OUT_B OUT

RES_CLK

GND

VDD

Figure 63 Adiabatic driver (layoutschematic view)

the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs

621 Adiabatic driver

The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63

62 physical implementation 95

Metal2Metal1PolyActive

IN

OUT_B

VDD

IN_B

OUT

GND

OUT_B

IN

GND

VDD

OUT

IN_B

Figure 64 P2LC (layoutschematic view)

622 Pulse-to-level converter

The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals

623 Data generator

A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1

As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE

624 Pulse generator

The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the

62 physical implementation 96

Metal2Metal1PolyActive EN

CLK

VDD

Q

GND

D

CLKGND

ENVDD

QF-F

D

GND

VDDCLK

D Q

Figure 65 Data generator (layoutschematic view)

Delay1

WPULSE

PULSE

CLKTCLK

DATA

CLK_RES

Figure 66 Timing constraints for the energy-recovery demonstratorcircuit

62 physical implementation 97

PULSES1A

S2

GND

VDD

Metal2Metal1PolyActive

VDD

S1

S2

GND

PULSE

Figure 67 Pulse generator (layoutschematic view)

S2ΔPhase

WPULSE

PULSE

S1A

Figure 68 Pulse generator InputOutput signals

gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control

For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68

625 Integrated inductor

The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-

1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)

62 physical implementation 98

100 200 300 400 500 600 700 8005

7

9

11

13

15

17

19

21

10

15

20

25

30

35

40

45

Q (Simulated) Energy Reduction ()

Frequency (MHz)

Qfactor

Energyreduction()

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS

lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)

As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited

For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610

626 Top level design

The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)

For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously

62 physical implementation 99

500μ

m

500μm

Figure 610 Planar square spiral inductor designed in ASITIC

shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance

The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG

80 = 16880 = 21Ω However as

the current flows to lower branch levels in the tree the perceivedresistance increases to RTG

16 RTG4 and on the last one to RTG Any

additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512

Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20

4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit

To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of

62 physical implementation 100

Adiabatic drivers

RTG

16

RTG

80

RTG

4

RTG

Level 1Level 2Level 3Level 4

Figure 611 Resonant pulse distribution tree

Branch level Wire resistance

1st 5 middot 16880 = 105mΩ

2nd 5 middot 16816 = 525mΩ

3rd 5 middot 1684 = 21Ω

4th 5 middot 168Ω = 84Ω

Total load 168+8480 + 21

20 + 05255 + 0105 = 252Ω

Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree

63 post-layout simulation and comparison 101

+-

+-Metal2 Metal1

180μ

m

100μm

Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)

the energy-recovery IO circuit in layout view is illustrated inFigure 613

627 CMOS IO circuit

The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance

The top level layout view of the CMOS IO circuit can be seenin Figure 616

63 post-layout simulation and comparison

The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed

63 post-layout simulation and comparison 102

22mm

12mm

GND GNDVBUF1 VDR

GND

VIND

S1A

S2

EN

OER

Figure 613 Energy-recovery IO circuit in layout view

63 post-layout simulation and comparison 103

OCMOS

S1B DataGenerator

TSV

CMOSDriver

CMOSDriver

TSV Buffer

1

80

EN

Figure 614 CMOS IO circuit schematic diagram

GND

VCMOS

DATA

078u

039u

105u

053u

316u

158u

950u

475u

OUT

Figure 615 CMOS driver schematic

12mm

12mm

OCMOS S1B GND VBUF2 VCMOSGND

GND GNDEN VCMOS

Figure 616 CMOS IO circuit in layout view

63 post-layout simulation and comparison 104

Energy dissipation (80-bits cycle)

Inductor 11pJ (Eq514)

Switch M1 092pJ [ ]

P2LCs 098pJ (Simulated)

Adiabatic drivers 315pJ (Eq512)

Total 615pJ

Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz

After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency

Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme

As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the

63 post-layout simulation and comparison 105

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14RES_CLK

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14DATA

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14OER

Time (ns)

V(V

)

Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit

64 experimental measurements 106

AreaEnergy

dissipation(Theory)

Energydissipation

(Post-layout)

CMOS 144mm2922pJ

(115 f Jbit)

(Eq52)

1265pJ

(158 f Jbit)

Energy recovery 264mm2 615pJ

(77 f Jbit)

768pJ

(96 f Jbit)

Difference +83 minus333 minus393

Table 64 Post-layout simulated energy dissipation Area require-ments

theoretically calculated 333 up to 393 in the post-layoutsimulations

64 experimental measurements

The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively

641 Measurement setup

For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620

The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66

642 Results

CMOS IO circuit

In Figure 621 the input signal S1B and the output signal OCMOS

are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency

64 experimental measurements 107

Figure 618 Energy-recovery IO circuit micrograph

Figure 619 CMOS IO circuit micrograph

AABC

DEFF

D

EEBA

EBBA

Figure 620 Measurement setup

64 experimental measurements 108

Test pad Direction Instrument connection

VIND Power Power supply - gt06V

GND Power Power supply - 0V

S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)

EN InputDigital Output (enable outputdata switching)

OER Output Oscilloscope

VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V

VBUF1 Power Power supply (Buffers) - 12V

Table 65 Energy-recovery circuit instrument connections

Test pad Direction Instrument connection

VCMOS PowerPower supply (CMOS drivers) -12V

GND Power Power supply - 0V

VBUF2 Power Power supply (Buffers) - 12V

S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

OCMOS Output Oscilloscope

EN InputDigital Output (enable outputdata switching)

Table 66 CMOS circuit instrument connections

64 experimental measurements 109

S1B

OCMOS

Figure 621 Oscilloscope output for signals S1B and OCMOS

100 200 300 400 500 600 700 800012345678910

Die1 Die2 Simulation

Frequency (MHz)

Powerdissipation(mW)

Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit

that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected

The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations

Energy-recovery IO circuit

Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER

64 experimental measurements 110

100 200 300 400 500 600 700 800024681012141618

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 623 Measuredsimulated energy dissipation for the CMOScircuit

S1A

OER

Figure 624 Oscilloscope output for signals S1A and OER

shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected

The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)

Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate

65 summary and conclusions 111

200 250 300 350 400 450 500 550 600 65002468101214161820

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit

IntegratedInductor

VINDOER

PulseGenerator

S1A

S2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

200pF

Figure 626 Energy-recovery IO circuit with fault hypothesis

properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER

switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation

65 summary and conclusions

In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them

The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-

65 summary and conclusions 112

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050

055

060

065

070

075

080RES_CLK

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500

02

04

06

08

10

12

14OER

Time (ns)

V(V

)

Figure 627 Simulation of circuit with fault hypothesis

65 summary and conclusions 113

200 250 300 350 400 450 500 550 600 6500

5

10

15

20

25

Die1 Die2 Simulation CL=200pF

Frequency (MHz)

Ene

rgy

diss

ipat

ion

(pJ)

Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis

ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS

circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated

CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

7C O N C L U S I O N S

Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density

As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV

devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs

Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-

FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process

Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process

The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work

114

71 main contributions 115

71 main contributions

From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data

The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations

In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design

72 future work 116

approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances

The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

72 future work

The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV

capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]

In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes

1 The transistors should be in a regular grid in the array forsimple processing of the measurement data

72 future work 117

2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy

3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV

4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues

Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility

In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory

AA P P E N D I X A P H O T O S

Figure A1 3D65 300mm wafer (on chuck)

Figure A2 Probe card

118

appendix a photos 119

Figure A3 3D130 200mm wafer (on chuck)

ProbeHead

Microscope

Figure A4 Probe head

BA P P E N D I X B S TAT I S T I C S

mean (micro)

In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as

micro =1n

n

sumi=1

xi

median (m)

The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean

standard deviation (σ)

Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation

A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability

standard deviation of the mean (σmicro )

The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated

Assuming there is a data set X with n values xi of mean micro

120

appendix b statistics 121

variance(X) = σ2X

variance(micro) = variance( 1n sum

ni=1 xi) =

1n2 variance(sumn

i=1 xi) =1n2 sum

ni=1 variance(xi) =

nn2 variance(X) =1n variance(X) =

σ2Xn rarr

σmicro = σXradicn

coefficient of variation

The coefficient of variation is the ratio of the standard deviationσ to the mean micro

CV =σ

micro

It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage

outlier

Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set

kruskalndash wallis [34]

KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal

B I B L I O G R A P H Y

[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac

currentsourcebroadpurposemn=2602

[2] Max-3d URL httpwwwmicromagiccom

[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet

[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet

[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001

[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI

Design Conf pages 171ndash174 2005 doi 101109ICVD200564

[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf

3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581

[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009

[9] C H Bennett Logical reversibility of computation IBM

Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525

[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038

scientificamerican0785-48

[11] E Beyne 3d system integration technologies In Proc Int

VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113

122

bibliography 123

[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs

22nm-Details_Presentationpdf

[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327

[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534

[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices

Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124

[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942

[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc

IEEE Int Interconnect Technology Conf and 2011 Materials for

Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326

[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf

(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823

[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS

rsquo04 volume 2 2004 doi 101109ISCAS20041329255

[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE

Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260

[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-

bibliography 124

asitics for 3-dimensional integrated circuits In Workshop

Notes Design Automation and Test in Europe (DATE) 2009

[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT

2008 pages 98ndash103 2008 doi 101109IDT20084802475

[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec

beScientificReportSR2009HTML1213307html

[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905

[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE

Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329

[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int

Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311

[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508

[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712

[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii

B9780750677295500446

[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455

bibliography 125

[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591

[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test

Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889

[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on

Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10

1145224081224115

[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical

Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779

[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom

pressroompdfkkuhnKuhn_IWCE_invited_textpdf

[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5

(3)183ndash191 1961 doi 101147rd530183

[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827

[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190

[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf

(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839

[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-

nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079

bibliography 126

[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System

level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm

org101145966747966750

[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453

[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf

PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725

[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE

Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793

[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629

[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp

Exhibition (DATE) pages 1689ndash1694 2010

[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573

[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190

[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi

bibliography 127

A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices

Meeting (IEDM) 2010 doi 101109IEDM20105703278

[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727

[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10

1109JPROC1998658762

[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test

Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103

[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc

56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676

[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443

[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88

(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom

sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on

[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303

bibliography 128

[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium

on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145

10576611057669

[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872

[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004

[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-

sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888

[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156

[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc

IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522

[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501

[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667

[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477

[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-

tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609

bibliography 129

[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology

Conf 2006 doi 101109ECTC20061645678

[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power

Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220

[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc

Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786

[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-

Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338

[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088

[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044

[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int

Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763

[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-

tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http

dxdoiorg101007BF01788562 101007BF01788562

bibliography 130

[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-

electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww

sciencedirectcomsciencearticleB6V44-4829XDJ-2N

2f5b2c221dcf240c38cd13b7b3432f913

[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182

[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541

[78] NHE Weste and DF Harris CMOS VLSI design a cir-

cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775

[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite

printsteve_wozniak_interview

[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-

tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716

[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect

Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710

[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746

[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764

[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915

bibliography 131

[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610

  • Abstract
  • Publications
  • Acknowledgments
  • Contents
  • List of Figures
  • List of Tables
  • Acronyms
  • 1 Introduction
    • 11 Research goals and contribution
    • 12 Thesis outline
      • 2 Background
        • 21 3D integrated circuits
          • 211 3D integration approaches
          • 212 TSV interconnects
            • 22 Energy-recovery logic
              • 221 Energy-recovery architectures
              • 222 Power-clock generators
                • 23 Full-custom design flow
                • 24 Summary
                  • 3 Characterization of TSV capacitance
                    • 31 Introduction
                    • 32 TSV parasitic capacitance
                    • 33 Integrated capacitance measurement techniques
                    • 34 Electrical characterization
                      • 341 Physical implementation
                      • 342 Measurement setup
                      • 343 Experimental results
                        • 35 Summary and conclusions
                          • 4 TSV proximity impact on MOSFET performance
                            • 41 Introduction
                            • 42 TSV-induced thermomechanical stress
                            • 43 Test structure implementation
                              • 431 MOSFET arrays
                              • 432 Digital selection logic
                                • 44 Experimental measurements
                                  • 441 Data processing methodology
                                  • 442 Measurement setup
                                  • 443 PMOS (long channel)
                                  • 444 PMOS (short channel)
                                    • 45 Summary and conclusions
                                      • 5 Evaluation of energy-recovery for TSV interconnects
                                        • 51 Introduction
                                        • 52 Energy-recovery TSV interconnects
                                        • 53 Analysis
                                          • 531 Adiabatic driver
                                          • 532 Inductors parasitic resistance
                                          • 533 Switch M1
                                          • 534 Timing Constraints
                                            • 54 Simulations
                                              • 541 Comparison to CMOS
                                                • 55 Summary and conclusions
                                                  • 6 Energy-recovery 3D-IC demonstrator
                                                    • 61 Introduction
                                                    • 62 Physical implementation
                                                      • 621 Adiabatic driver
                                                      • 622 Pulse-to-level converter
                                                      • 623 Data generator
                                                      • 624 Pulse generator
                                                      • 625 Integrated inductor
                                                      • 626 Top level design
                                                      • 627 CMOS IO circuit
                                                        • 63 Post-layout simulation and comparison
                                                        • 64 Experimental measurements
                                                          • 641 Measurement setup
                                                          • 642 Results
                                                            • 65 Summary and conclusions
                                                              • 7 Conclusions
                                                                • 71 Main contributions
                                                                • 72 Future work
                                                                  • A Appendix A Photos
                                                                  • B Appendix B Statistics
                                                                  • Bibliography
Page 5: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate

A B S T R A C T

The aggressive scaling of CMOS process technology has been driv-ing the rapid growth of the semiconductor industry for more thanthree decades In recent years the performance gains enabledby CMOS scaling have been increasingly challenged by highly-parasitic on-chip interconnects as wire parasitics do not scale atthe same pace Emerging 3D integration technologies based onvertical through-silicon vias (TSVs) promise a solution to the inter-connect performance bottleneck along with reduced fabricationcost and heterogeneous integration

As TSVs are a relatively recent interconnect technology innov-ative test structures are required to evaluate and optimise theprocess as well as extract parameters for the generation of designrules and models From the circuit designerrsquos perspective criticalTSV characteristics are its parasitic capacitance and thermomech-anical stress distribution This work proposes new test structuresfor extracting these characteristics The structures were fabric-ated on a 65nm 3D process and used for the evaluation of thattechnology

Furthermore as TSVs are implemented in large densely inter-connected 3D-system-on-chips (SoCs) the TSV parasitic capacitancemay become an important source of energy dissipation Typicallow-power techniques based on voltage scaling can be usedthough this represents a technical challenge in modern techno-logy nodes In this work a novel TSV interconnection scheme isproposed based on reversible computing which shows frequency-dependent energy dissipation The scheme is analysed using the-oretical modelling while a demonstrator IC was designed basedon the developed theory and fabricated on a 130nm 3D process

iii

P U B L I C AT I O N S

Some ideas and figures have appeared previously in the followingpublications

1 D Perry J Cho S Domae P Asimakopoulos A YakovlevP Marchal G Van der Plas and N Minas An efficientarray structure to characterize the impact of through siliconvias on fet devices In Proc IEEE Int Microelectronic TestStructures (ICMTS) Conf pages 118ndash122 2011 (best paper

award)

2 A Mercha G Van der Plas V Moroz I De Wolf P Asi-

makopoulos N Minas S Domae D Perry M Choi ARedolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron DevicesMeeting (IEDM) 2010

3 A Mercha A Redolfi M Stucchi N Minas J Van OlmenS Thangaraju D Velenis S Domae Y Yang G Katti RLa- bie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D Perry G Vander Plas J H Cho P Marchal Y Travaly E Beyne S Biese-mans and B Swinnen Impact of thinning and throughsilicon via proximity on high-k metal gate first cmos per-formance In Proc Symp VLSI Technology (VLSIT) pages109ndash110 2010

4 P Asimakopoulos G Van der Plas A Yakovlev and PMarchal Evaluation of energy-recovering interconnects forlow-power 3d stacked ics In Proc IEEE Int Conf 3D Sys-tem Integration 3DIC 2009 pages 1ndash5 2009

5 P Asimakopoulos and A Yakovlev An AdiabaticPower-Supply Controller For Asynchronous Logic Circuits20th UK Asynchronous Forum pages 1ndash4 2008

iv

In the end I hope therersquos a little note somewhere

that says I designed a good computer

mdash Steve Wozniak [79]

A C K N O W L E D G E M E N T S

First and foremost I would like to express my gratitude to mysupervisor Prof Alex Yakovlev who provided the perfect balancebetween freedom and guidance allowing me to obtain sphericalknowledge and scope in microelectronics but at the same time topromptly complete my studies Irsquom also grateful to the Engineer-ing and Physical Science Research Council (EPSRC) for fundingmy PhD studies through their Doctoral Training Accounts

As part of my research project I had the opportunity to spendsome time at Imec Belgium The experiences and lessons learnedduring my stay at Imec will undoubtedly prove valuable in myforthcoming professional career I would like to thank Dan Perryfrom Qualcomm for sharing his experience and ideas for the de-sign of the characterization structures implemented on the 65nm

Imec process Special thanks to Shinichi Domae from Panasonicand Dr Dimitrios Velenis for their assistance with the 65nm wafermeasurements as well as Marco Facchini for the time he spendtrying to measure the energy-recovery circuits Last but not leastIrsquod like to thank Geert Van der Plas and Paul Marchal for theiradvices and guidance throughout my stay at Imec

Finally many thanks to my colleagues at Newcastle Universityfor assisting me in diverse ways during my studies I would liketo especially mention Dr Santosh Shedabale and James Dochertyfor reviewing and commenting on parts of this text as well asDr Robin Emery for the numerous technical discussions and forhelping setup the design tools

v

C O N T E N T S

1 Introduction 1

11 Research goals and contribution 2

12 Thesis outline 3

2 Background 5

21 3D integrated circuits 5

211 3D integration approaches 6

212 TSV interconnects 9

22 Energy-recovery logic 13

221 Energy-recovery architectures 17

222 Power-clock generators 18

23 Full-custom design flow 20

24 Summary 21

3 Characterization of TSV capacitance 23

31 Introduction 23

32 TSV parasitic capacitance 23

33 Integrated capacitance measurement techniques 24

34 Electrical characterization 27

341 Physical implementation 28

342 Measurement setup 36

343 Experimental results 37

35 Summary and conclusions 44

4 TSV proximity impact on MOSFET performance 46

41 Introduction 46

42 TSV-induced thermomechanical stress 46

43 Test structure implementation 48

431 MOSFET arrays 51

432 Digital selection logic 55

44 Experimental measurements 61

441 Data processing methodology 61

442 Measurement setup 62

443 PMOS (long channel) 64

444 PMOS (short channel) 71

45 Summary and conclusions 73

5 Evaluation of energy-recovery for TSV interconnects 75

51 Introduction 75

52 Energy-recovery TSV interconnects 75

53 Analysis 79

531 Adiabatic driver 80

532 Inductorrsquos parasitic resistance 82

533 Switch M1 82

534 Timing Constraints 83

54 Simulations 83

541 Comparison to CMOS 86

vi

contents vii

55 Summary and conclusions 91

6 Energy-recovery 3D-IC demonstrator 92

61 Introduction 92

62 Physical implementation 92

621 Adiabatic driver 94

622 Pulse-to-level converter 95

623 Data generator 95

624 Pulse generator 95

625 Integrated inductor 97

626 Top level design 98

627 CMOS IO circuit 101

63 Post-layout simulation and comparison 101

64 Experimental measurements 106

641 Measurement setup 106

642 Results 106

65 Summary and conclusions 111

7 Conclusions 114

71 Main contributions 115

72 Future work 116

a Appendix A Photos 118

b Appendix B Statistics 120

bibliography 122

L I S T O F F I G U R E S

Figure 11 Scaling trend of logic and interconnect delay[30] 2

Figure 21 Comparison of 2D3D Integration 5

Figure 22 System-in-a-package 6

Figure 23 Package-on-package 6

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7

Figure 25 3D wafer-level-packaging 7

Figure 26 3D-stacked IC 8

Figure 27 TSV process steps 9

Figure 28 Before thinningafter thinning 10

Figure 29 TSV stacking 10

Figure 210 Conventional AND gate 14

Figure 211 CMOS switching 16

Figure 212 Adiabatic switching 16

Figure 213 Basic 2N-2P adiabatic circuit 17

Figure 214 2N-2P timing 17

Figure 215 Adiabatic driver 18

Figure 216 Stepwise charging generator 19

Figure 217 1N power clock generator 20

Figure 218 Design flow 22

Figure 219 Inverter in Virtuoso schematic 22

Figure 220 Inverter in Virtuoso layout 22

Figure 31 Isolated 2D wire capacitance 24

Figure 32 TSV parasitic capacitance 25

Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25

Figure 34 CBCM pseudo-inverters 27

Figure 35 Current as a function of frequency in CBCM 28

Figure 36 CBCM pseudo-inverter pair in layout 29

Figure 37 PRC1 test pad module (layout) 30

Figure 38 PRC1 test pad module (micrograph) 30

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34

Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35

viii

List of Figures ix

Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35

Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36

Figure 316 Measurement setup 36

Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39

Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40

Figure 319 TSV capacitance measurement results onwafer D11 40

Figure 320 TSV capacitance measurement results onwafer D18 41

Figure 321 Simulated oxide liner thickness variation 42

Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42

Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44

Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44

Figure 41 TSV thermomechanical stress simulation [40] 47

Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48

Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49

Figure 44 PTP1 test pad module (layout) 50

Figure 45 PTP1 test pad module (micrograph) 50

Figure 46 MOSFET dimensions and rotations 51

Figure 47 1 TSV MOSFET array configuration 52

Figure 48 4 diagonal TSVs MOSFET array configuration52

Figure 49 4 lateral TSVs MOSFET array configuration 53

Figure 410 9 TSVs MOSFET array configuration 53

Figure 411 MOSFET array detail 55

Figure 412 Digital selection logic 56

Figure 413 MOSFET array with digital selection logic 57

Figure 414 4-bit Gate-enableArray-enable decoder 58

Figure 415 Transmission gate for MOSFET Gate terminals59

Figure 416 Transmission gates for MOSFET Drain-Source terminals 59

Figure 417 2-bit Drain-select decoder 60

Figure 418 Processing of measurement results 62

Figure 419 Interpolation of measurement results 63

Figure 420 Measurement setup 63

Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65

List of Figures x

Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66

Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66

Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67

Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67

Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68

Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69

Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69

Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70

Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70

Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71

Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71

Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72

Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72

Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76

Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77

Figure 53 Adiabatic driver 78

Figure 54 A simple pulse-to-level converter (P2LC) 78

Figure 55 Resonant pulse generator with dual-rail dataencoding 79

Figure 56 Energy-recovery scheme for TSV interconnects79

Figure 57 Energy-recovery timing constraints 83

Figure 58 Test case for the evaluation of the theoret-ical model 84

Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85

Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85

Figure 511 Test case for comparison of energy-recoveryto CMOS 86

Figure 512 Improved P2LC design 87

List of Figures xi

Figure 513 Simulated energy dissipation of improvedP2LC design 87

Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88

Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88

Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89

Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =

80 f F f = 500MHz) 89

Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90

Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90

Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93

Figure 62 Energy-recovery IO circuit schematic dia-gram 94

Figure 63 Adiabatic driver (layoutschematic view) 94

Figure 64 P2LC (layoutschematic view) 95

Figure 65 Data generator (layoutschematic view) 96

Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96

Figure 67 Pulse generator (layoutschematic view) 97

Figure 68 Pulse generator InputOutput signals 97

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98

Figure 610 Planar square spiral inductor designed inASITIC 99

Figure 611 Resonant pulse distribution tree 100

Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101

Figure 613 Energy-recovery IO circuit in layout view 102

Figure 614 CMOS IO circuit schematic diagram 103

Figure 615 CMOS driver schematic 103

Figure 616 CMOS IO circuit in layout view 103

Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105

Figure 618 Energy-recovery IO circuit micrograph 107

List of Figures xii

Figure 619 CMOS IO circuit micrograph 107

Figure 620 Measurement setup 107

Figure 621 Oscilloscope output for signals S1B and OC-MOS 109

Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109

Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110

Figure 624 Oscilloscope output for signals S1A and OER110

Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111

Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111

Figure 627 Simulation of circuit with fault hypothesis 112

Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113

Figure A1 3D65 300mm wafer (on chuck) 118

Figure A2 Probe card 118

Figure A3 3D130 200mm wafer (on chuck) 119

Figure A4 Probe head 119

L I S T O F TA B L E S

Table 21 3D interconnect technologies comparison 8

Table 22 Technology parameters for Imec 3D130 process11

Table 23 Technology parameters for Imec 3D65 process11

Table 24 Comparison energy-recoveryconventionalCMOS 16

Table 31 Test structures on the PRC1 module 31

Table 32 PRC1 test pad instrument connections 37

Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38

Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39

Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43

Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45

Table 41 Coefficients of thermal expansion 47

Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54

Table 43 MOSFET array configurations selected forexperimental evaluation 61

Table 44 PTP1PTP2 test pad module instrumentconnections 64

Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73

Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84

Table 61 Energy-recovery IO circuit parameters 93

Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100

Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104

Table 64 Post-layout simulated energy dissipation Area requirements 106

Table 65 Energy-recovery circuit instrument connec-tions 108

Table 66 CMOS circuit instrument connections 108

xiii

A C R O N Y M S

2D two-dimentional

2N-2P 2 nMOS2 pMOS adiabatic logic

3D-IC three-dimentional integrated circuit

3D-SIC 3D stacked-IC

3D-SIP 3D system-in-package

3D three-dimentional

3D-WLP 3D wafer-level-packaging

BCB benzocyclobutene

BEOL back-end-of-line

CAL clocked adiabatic logic

CBCM charge-based capacitance measurement

CMOS complementary metal-oxide-semiconductor

CMP chemical-mechanical planarization

CTE coefficient of thermal expansion

Cu copper

CVD chemical vapor deposition

D2D die-to-die

D2W die-to-wafer

DCVSL differential cascode voltage switch logic

DRC design rule check

DUT device under test

ECRL efficient charge recovery logic

EDA electronic design automation

FEM finite element method

FEOL front-end-of-line

GPIB general purpose interface bus

IC integrated circuit

xiv

acronyms xv

IO inputoutput

KOZ keep-out-zone

LC inductance-capacitance

LVS layout versus schematic

MEMS microelectromechanical systems

MIM metal-insulator-metal

MOSFET metal-oxide-semiconductor field-effect transistor

MOS metal-oxide-semiconductor

MPU microprocessor unit

MuGFET multiple gate field-effect transistor

NMOS n-channel MOSFET

P2LC pulse-to-level converter

PAL pass-transistor adiabatic logic

PCB printed circuit board

PCG power-clock generator

PMD pre-metal dielectric

PMOS p-channel MOSFET

PVT process voltage temperature

QSERL quasi-static energy recovery logic

RC resistance-capacitance

RDL re-distribution layer

RLC resistance-inductance-capacitance

RMS root mean square

SIP system-in-a-package

Si silicon

SMU source measurement unit

SoC system-on-chip

TaN tantalum nitride

TSV through-silicon via

W tungsten

acronyms xvi

W2W wafer-to-wafer

WLP wafer-level-packaging

1I N T R O D U C T I O N

Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965

chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]

As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects

In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)

The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and

1 International Technology Roadmap for Semiconductors

1

11 research goals and contribution 2

Figure 11 Scaling trend of logic and interconnect delay [30]

innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]

Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D

interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density

11 research goals and contribution

As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-

ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]

The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future

12 thesis outline 3

3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection

The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution

The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs

are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process

The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements

12 thesis outline

This thesis is organised into seven Chapters and two Appendixesas follows

2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)

12 thesis outline 4

Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described

Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods

Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process

Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design

Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements

Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work

Appendix A includes photos used as reference in various partsof this work

Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data

2B A C K G R O U N D

The work presented in this thesis contributes in the fields of 3D

integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters

21 3d integrated circuits

CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package

There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy

Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of

ABCDEFBA ABCDEFBA

Figure 21 Comparison of 2D3D Integration

5

21 3d integrated circuits 6

ABCDEFBC

E

Figure 22 System-in-a-package

ABCDEFBC

E

Figure 23 Package-on-package

a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities

The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor

211 3D integration approaches

3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure

bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages

bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)

bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion

The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)

3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps

21 3d integrated circuits 7

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]

A

BCDEAFE

FFDC

CFE

Figure 25 3D wafer-level-packaging

take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]

3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]

The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-

SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost

A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21

21 3d integrated circuits 8

ABCD

EFFDB

F

Figure 26 3D-stacked IC

3D interconnect

technologyAdvantages Disadvantages

3D system-in-package (3D-SIP)

Simple existingtechnology bestheterogeneous

integration

Performance(speed power) low

densitymanufacturing costin high-quantities

3D wafer-level-packaging (3D-

WLP)

Existingtechnology good

compromisebetween

performance-manufacturing

cost

Averageperformance and

density

3D stacked-IC (3D-SIC)

Performance(speed power)

high densitymanufacturing costin high-quantities

Manufacturing costin low-quantities

not ready formanufacturing

Table 21 3D interconnect technologies comparison

21 3d integrated circuits 9

ABBCDE

FBACACDA

ACCD F

Figure 27 TSV process steps

212 TSV interconnects

Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection

In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie

Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively

As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]

21 3d integrated circuits 10

AAAB

CAAAB

ABC

DEF

Figure 28 Before thinningafter thinning

ABC ABC

DE DE

F

DEE

B

CBA

Figure 29 TSV stacking

21 3d integrated circuits 11

Minimum devicechannel length (nm)

130

Supply voltage (V) 12

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 22

TSV resistance (mΩ) ~20

TSV capacitance( f F)

~40

Wafer diameter(mm)

200

Table 22 Technology paramet-ers for Imec 3D130

process

Minimum devicechannel length (nm)

70

Supply voltage (V) 10

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 40

TSV resistance (mΩ) ~40

TSV capacitance( f F)

~80

Wafer diameter(mm)

300

Table 23 Technology paramet-ers for Imec 3D65 pro-cess

Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing

Design tools

Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs

Thermomechanical stress

TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur

21 3d integrated circuits 12

The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance

Thermal analysis

The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink

Electrical characteristics

The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]

Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV

structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process

22 energy-recovery logic 13

Testing

Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems

22 energy-recovery logic

TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology

1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections

2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them

Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that

22 energy-recovery logic 14

Figure 210 Conventional AND gate

were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6

Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded

Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output

Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to

22 energy-recovery logic 15

perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]

In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]

The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =

CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]

E = QVDD = CV2DD (21)

From the total transferred energy only half ( 12 CV2

DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C

The adiabatic charging of an equivalent node capacitance C

is modelled in Figure 212 In contrast to the conventional CMOS

circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to

EAdiabatic prop

(

RC

T

)

CV2DD (22)

The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters

22 energy-recovery logic 16

A

B

CC

D

E

D

Figure 211 CMOS switching

AA

B

C

B

Figure 212 Adiabatic switching

Conventional CMOS Energy-recovery CMOS

2 states True False 3 states True False Off

Energy of information bitdissipated as heat on the

pull-down network

Energy of information bitrecovered by the

power-supply

High speed Lowmedium speed

High energy dissipation Very low energy dissipation

Area efficient Some area overhead

Table 24 Comparison energy-recoveryconventional CMOS

Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]

A comparison between energy-recovery and conventional CMOSis summarised in Table 24

22 energy-recovery logic 17

ABCABC

Figure 213 Basic 2N-2P adiabaticcircuit

ABC

ABC

DEF

EE

DEF

Figure 214 2N-2P timing

221 Energy-recovery architectures

Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency

A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P

circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern

The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel

22 energy-recovery logic 18

Figure 215 Adiabatic driver

MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network

The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]

The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle

Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]

222 Power-clock generators

Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and

22 energy-recovery logic 19

A

A

B

C

C

Figure 216 Stepwise charging generator

recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators

Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow

Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at

f =1

2πradic

LCLOAD

(23)

As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is

23 full-custom design flow 20

A

BCD

E

DFF

CD

Figure 217 1N power clock generator

partially recovered and stored in the inductor for later use RLC

oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip

23 full-custom design flow

All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal

In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available

The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case

1 Cadence Design Systems Inc (httpwwwcadencecom)

24 summary 21

the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step

Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics

24 summary

In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process

2 Mentor Graphics Inc (httpwwwmentorcom)

24 summary 22

ABCDEA

FCCAAE

ABCDEAEC

DE

E

F

CEA

EEAEA

EDE

CAC

E

EC

CAC

Figure 218 Design flow

Figure 219 Inverter in Virtuososchematic

Figure 220 Inverter in Virtuosolayout

3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E

31 introduction

In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology

In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]

From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system

Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated

32 tsv parasitic capacitance

A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric

1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments

23

33 integrated capacitance measurement techniques 24

ABC

Figure 31 Isolated 2D wire capacitance

material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is

C =εox

hwl (31)

In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material

In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)

The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)

33 integrated capacitance measurement techniques

Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV

33 integrated capacitance measurement techniques 25

ABCDE

F

FD

F

F

FD

Figure 32 TSV parasitic capacitance

ABCDE FACDE

ECDE

BCFACDE

FCCDED DBCDE

D

DAB

Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor

33 integrated capacitance measurement techniques 26

characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories

1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]

2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]

The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer

On-chip techniques can achieve better accuracy when the DUT

capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]

In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows

CDUT =Im minus Ire f

VDD middot f(32)

The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of

34 electrical characterization 27

AB

AC

DE

FE

DE

FE

AB

AC

B

Figure 34 CBCM pseudo-inverters

Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot

The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET

capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz

should be adequate to avoid the MOS behaviour of the TSV

34 electrical characterization

A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented

34 electrical characterization 28

AB

CADE

F

AB

E

CAD

B

Figure 35 Current as a function of frequency in CBCM

on the same chip that went beyond the core objectives of thiswork

The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed

341 Physical implementation

In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result

Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics

34 electrical characterization 29

Poly dummy fills

Metal2Metal1PolysiliconActive

1μm 5μm

Figure 36 CBCM pseudo-inverter pair in layout

Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38

The CBCM pseudo-inverters were implemented on either TOP

or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below

Structure rsquoArsquo Single TSV capacitance (TOP die)

The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to

34 electrical characterization 30

DTOP DBTM

B10 A10

BREF AREF

B1 A1

R1 R2

SET VDD

RESET GND

C15 C10

C50 C20

SET VDD

ERC SET

EREF EC Metal2 (TOP)Metal1 (TOP)

Metal1 (BTM)Metal2 (BTM)

Figure 37 PRC1 test pad module(layout)

Figure 38 PRC1 test pad module(micrograph)

34 electrical characterization 31

Structure A B

TSVs - 1 2x5 - 1 2x5

Inverterlocation

TOP TOP TOP BTM BTM BTM

TSVconnection

TOP TOP TOP BTM BTM BTM

Multi-TSVconnection

- - TOP - - BTM

Multi-TSVpitch (microm)

- - 15 - - 15

Pad name AREF A1 A10 BREF B1 B10

Structure C D

TSVs 2x2 2x2 2x2 2x2 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP TOP

TSVconnection

TOP TOP TOP TOP TOP TOP

Multi-TSVconnection

TOP TOP TOP TOP TOP BTM

Multi-TSVpitch (microm)

10 15 20 50 20 20

Pad name C10 C15 C20 C50 DTOP DBTM

Structure R E

TSVs 1 1 - 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP

TSVconnection

TOP TOP - TOP TOP

Multi-TSVconnection

- - - TOP BTM

Multi-TSVpitch (microm)

- - - 20 20

Pad name R1 R2 EREF EC ERC

Table 31 Test structures on the PRC1 module

34 electrical characterization 32

Pseudo-inverters

2x5 TSVs

TSV

Reference

Metal2 (TOP)Metal1 (TOP)

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)

the single TSV DUT the structure also contains 10 parallel TSVs

arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude

Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)

The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side

Structure rsquoCrsquo Variable pitch TSV capacitance

The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the

34 electrical characterization 33

2x5 TSVs

Reference

TSV

Pseudo-inverters

Metal2 (BTM)Metal1 (BTM)

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)

literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process

The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative

Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-

TOM die

The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking

34 electrical characterization 34

2x2 TSVs - 10μm

2x2 TSVs - 15μm

2x2 TSVs - 20μm

2x2 TSVs - 50μm

Metal2 (TOP)Metal1 (TOP)

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die

34 electrical characterization 35

Metal2 (TOP)Metal1 (TOP)

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Driver

Reference

Driver

Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)

Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy

Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability

Structure rsquoErsquo TSV capacitance effect on signal propagation delay

Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult

34 electrical characterization 36

ABC

D

ECBFC

ABC

Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)

ABC

DEF

CD

A

CA

A

Figure 316 Measurement setup

342 Measurement setup

The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316

The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-

2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup

34 electrical characterization 37

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

SET InputPulse generator

(Non-overlapping with RESET)

RESET InputPulse generator

(Non-overlapping with SET)

A[REF 1 10] InputSMU (Inverter source - 1V

supply)

B[REF 1 10] InputSMU (Inverter source - 1V

supply)

C[10 15 2050]

InputSMU (Inverter source - 1V

supply)

D[TOP BTM] InputSMU (Inverter source - 1V

supply)

R[1 2] InputSMU (Inverter source - 1V

supply)

E[REF C RC] Output Oscilloscope (high-impedance)

Table 32 PRC1 test pad instrument connections

sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32

343 Experimental results

There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods

When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy

34 electrical characterization 38

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 454

Mean capacitance 709 f F

Standard deviation (σ) 19 f F

Coefficient of variation (3σmicro) 80

Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo

The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability

Die to die (D2D) variability

The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)

On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid

On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691

34 electrical characterization 39

-60 -40 -20 00 20 40 60 80

σ-σ

706

Deviation from mean (fF)

0

005

01

015

02

025

Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 414

Mean capacitance 895 f F

Standard deviation (σ) 22 f F

Coefficient of variation (3σmicro) 74

Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo

of the experimental data within plusmnσ of the mean capacitancevalue

Correlation between TSV capacitance and oxide liner thickness vari-

ation

The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers

Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication

34 electrical characterization 40

σ-σ

691

Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70

0

002

004

006

008

01

012

014

016

018

02

Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function

075

080

085

090

095

100

105

110

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

075

080

085

090

095

100

105

Wafer area

Figure 319 TSV capacitance measurement results on wafer D11

34 electrical characterization 41

086088090092094096098100102104106108

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

086

088

090

092

094

096

098

100

102

104

106

108

Wafer area

Figure 320 TSV capacitance measurement results on wafer D18

tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo

Wafer to wafer (W2W) variability

Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters

34 electrical characterization 42

109

100

095Nor

mal

ized

oxi

de th

ickn

ess

-5-10 -15 -20 -25 -30

510

1520

2530

0Die Y

Die X

Figure 321 Simulated oxide liner thickness variation

60 65 70 75 80 85 90 95 1000

005

01

015

02

025D11 D18

plusmnσ

plusmnσ

Capacitance (fF)

Figure 322 TSV capacitance W2W variability - Probability density func-tions

34 electrical characterization 43

Structure rsquoCrsquo (Fig 311)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 424

Experiment C10 C50

Number of TSVs 4 4

Mean capacitance 4073 f F 4125 f F

Standard deviation (σ) 84 f F 74 f F

Population within σ 695 653

Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo

Variable pitch TSV capacitance

The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm

were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires

As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50

are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability

To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]

Measurement accuracy of CBCM

The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we

3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)

35 summary and conclusions 44

376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320

001

002

003

004

005

006C10 C50

Capacitance (fF)

plusmnσC10

plusmnσC50

695 653

Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function

-80 -60 -40 -20 00 20 40 60 80 1000

002

004

006

008

01

012

014

016

018σ-σ

688

Deviation from mean (fF)

Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function

observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs

connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance

35 summary and conclusions

In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement

35 summary and conclusions 45

Method LCR CBCM

Total dies 472 472

Processed dies (excluding outliers) 393 414

Mean capacitance 820 f F 895 f F

Standard deviation (σ) 23 f F 22 f F

Coefficient of variation (3σmicro) 86 74

Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo

results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]

The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration

The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV

parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM

In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process

4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E

41 introduction

The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV

filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance

Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models

In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]

42 tsv-induced thermomechanical stress

Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or

46

42 tsv-induced thermomechanical stress 47

Material CTE at 20 ˚C (10minus6K)

Si 3

Cu 17

W 45

Table 41 Coefficients of thermal expansion

Figure 41 TSV thermomechanical stress simulation [40]

polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate

Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction

It is well known that stress can affect carrier mobility in MOS-

FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance

Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type

43 test structure implementation 48

Figure 42 TSV thermomechanical stress interaction between two TSVs[40]

proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative

The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections

43 test structure implementation

TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible

43 test structure implementation 49

Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]

The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET

devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types

Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP

die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45

43 test structure implementation 50

A0

B0

C0

D0

E0

F0

G0

H0

I0

J0

A90

B90

E90

F90

G90

J90

DS3 DF3

DF2 DS0

DS2 DF0

DF1 DS1

SS SF

DEC6 VDD

DEC7 GND

DEC8 VG

DEC4 DEC9

DEC5 GND

DEC2 DEC3

DEC0 DEC1

Arraydecoder

Draindecoder

Figure 44 PTP1 test pad mod-ule (layout)

Figure 45 PTP1 test pad mod-ule (micrograph)

43 test structure implementation 51

Metal2Metal1PolyActive

TSV

MOSFETs

07μm007μm 0deg

07μm007μm 90deg

05μm05μm 90deg

05μm05μm 0deg

Figure 46 MOSFET dimensions and rotations

431 MOSFET arrays

Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)

The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs

lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44

The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)

Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution

43 test structure implementation 52

TSV

MOSFETs Substrate taps

Figure 47 1 TSV MOSFET array configuration

MOSFETs Substrate taps

TSV TSV

TSV TSV

Figure 48 4 diagonal TSVs MOSFET array configuration

43 test structure implementation 53

MOSFETs Substrate taps

TSV

TSV TSV

TSV

Figure 49 4 lateral TSVs MOSFET array configuration

MOSFETs Substrate taps

TSV TSV TSV

TSV TSV TSV

TSV TSV TSV

Figure 410 9 TSVs MOSFET array configuration

43 test structure implementation 54

Configuration RotationFET

width(microm)

FETlength(microm)

Name

0 0deg 07 007 A0

1 0deg 07 007 B0

4diagonal 0deg 07 007 C0

4lateral 0deg 07 007 D0

9 0deg 07 007 E0

0 0deg 05 05 F0

1 0deg 05 05 G0

4diagonal 0deg 05 05 H0

4lateral 0deg 05 05 I0

9 0deg 05 05 J0

0 90deg 07 007 A90

1 90deg 07 007 B90

9 90deg 07 007 E90

0 90deg 05 05 F90

1 90deg 05 05 G90

9 90deg 05 05 J90

Table 42 MOSFET array configurations for test pad modulesPTP1PTP2

43 test structure implementation 55

2μm 1μm

Metal2Metal1PolyActive

FET

DUMMY

TAP

TSV

Figure 411 MOSFET array detail

432 Digital selection logic

Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement

The procedure for selecting a single MOSFET device for meas-urement was as follows

bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled

bull MOSFET Gate terminal (G0 to G15) was connected to IO

pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates

bull MOSFET Drain terminal (D0 to D15) was connected to IO

pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder

43 test structure implementation 56

D0

D15

DF[0-3]

DS[0-3]

Drain-select decoder (2-bit)

SSF

SS

Array-select decoder (4-bit)

VG

Gate-select decoder (4-bit)

G0 G15

MOSFET Array

Figure 412 Digital selection logic

and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit

Drain-select decoder was adequate for accessing all 16Drain terminals in the array

When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)

A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating

43 test structure implementation 57

Gatedecoder

Metal2Metal1PolyActive

Transmissiongates (DS)

Transmission gates (G)

MOSFETArray

Figure 413 MOSFET array with digital selection logic

43 test structure implementation 58

BLK

A[0] A[1] A[2] A[3]

DEC[0-15]

Figure 414 4-bit Gate-enableArray-enable decoder

43 test structure implementation 59

DEC

VGGVDD

Transmission gate

Pull-downtransistor

Figure 415 Transmission gate for MOSFET Gate terminals

Transmission gates

FORCE

SENSE

DRAINSOURCE

EN

VDD

Figure 416 Transmission gates for MOSFET DrainSource terminals

43 test structure implementation 60

A[0] A[1]

DEC[0-3]

Figure 417 2-bit Drain-select decoder

44 experimental measurements 61

Configuration Type Name Wafer

1 No-TSVPMOS

(05microm times 05microm)F0 D08

2 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D08

3 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D20

4 1-TSV 0ordm 60 ordmCPMOS

(05microm times 05microm)G0 D08

54-lateral TSV 0ordm 25

ordmCPMOS

(05microm times 05microm)I0 D08

6 1-TSV 90ordm 25 ordmCPMOS

(05microm times 05microm)G90 D08

7 1-TSV 0ordm 25 ordmCPMOS (07microm times

007microm)B0 D08

Table 43 MOSFET array configurations selected for experimental eval-uation

44 experimental measurements

Test modules PTP1 and PTP2 implementing the presented MOS-

FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43

441 Data processing methodology

For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress

Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing

44 experimental measurements 62

ABCDEFBBC

FCDDFBCFFC

FCB

A

BAACADE

CECBBC

CFFCDE

B$BCFFBDBCBF

amp(BFC)B

BC

FCDDFA

CAF

Figure 418 Processing of measurement results

the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (

radic98) The described procedure for

processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not

placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)

442 Measurement setup

For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO

signal generation and measurement was provided by standard

44 experimental measurements 63

ABCDECDBF

ABECBF

AB

AB

Figure 419 Interpolation of measurement results

laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420

The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44

ABCCDE

F

AA

A

DBCB

Figure 420 Measurement setup

44 experimental measurements 64

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

DEC[0-9] InputDigital Output

(GATEDRAINARRAY SelectDecoders Input)

DF[0-3] Input SMU (Drain Force terminal)

DS[0-3] Output SMU (Drain Sense terminal)

SF Input SMU (Source Force terminal)

SS Output SMU (Source Sense terminal)

VG Input Digital Output (Gate voltage)

Table 44 PTP1PTP2 test pad module instrument connections

443 PMOS (long channel)

No-TSV configuration

This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance

As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure

1 TSV configuration

In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo

44 experimental measurements 65

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

096

097

098

099

100

101

Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)

and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements

To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm

The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph

44 experimental measurements 66

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

105

110

115

120TSV

Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)

44 experimental measurements 67

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

13

TSV

Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)

we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV

even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after

increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

09

1

11

12

13

14

Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)

44 experimental measurements 68

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115TSV

Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)

at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)

4 lateral TSV configuration

In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430

respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)

90˚ rotation TSV configuration

In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has

44 experimental measurements 69

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

085

090

095

100

105

110

115

TSV

TSV TSV

TSV

Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)

44 experimental measurements 70

2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309

095

1

105

11

115

12

X16

Distance (μm)

Normalized

Ion

TSV TSV

Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)

3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098

1102

Y14

Distance (μm)

Normalized

Ion

TSV TSV

Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)

44 experimental measurements 71

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115

120TSV

Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)

been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm

444 PMOS (short channel)

In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

08509

0951

10511

11512

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)

44 experimental measurements 72

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

094

096

098

100

102

104

106

108

110

TSV

Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)

measurements for X = 16 and Y = 14 in Figure 434 At at 1microm

distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)

45 summary and conclusions 73

Size

Waf

er

Con

figu

rati

on

Rot

atio

n

Eff

ect

1micro

mla

t

Eff

ect

1micro

mlo

ng

KO

Z

L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm

L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm

L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm

L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm

L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm

S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm

Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices

45 summary and conclusions

In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET

device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors

The structure was implemented in Imecrsquos experimental 65nm

3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations

In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits

Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short

45 summary and conclusions 74

channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress

Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure

1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex

2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy

3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away

4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic

The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated

In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs

5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S

51 introduction

TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2

DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs

increases linearly with the number of tiers and interconnections(Figure 51)

The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm

node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)

In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]

52 energy-recovery tsv interconnects

As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-

75

52 energy-recovery tsv interconnects 76

1XTSV

Tier1

TSV

Tier0LOGIC 0

LOGIC 0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n Tier1

Tier0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n

Tier(m-1)

Tier(m)

mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-

tion density

recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]

EER prop

(

RCL

T

)

CLV2DD (51)

In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]

ECMOS =12

CLV2DD (52)

Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits

EER

ECMOSprop

2RCL

T(53)

From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small

In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC

52 energy-recovery tsv interconnects 77

Tier2Tier1TSV

LOGIC 1

LOGIC 2

driver

TSV

LOGIC 1

LOGIC 2

CMOS

driver

Convert

Energy-recovery

PCG

Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery

can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV

interconnectsThe differences between a typical CMOS driver in a TSV-based

3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form

Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values

52 energy-recovery tsv interconnects 78

IN IN

ININ

OUT OUT

IN

IN

OUT

OUT

Figure 53 Adiabatic driver

Figure 54 A simple pulse-to-level converter (P2LC)

An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded

The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)

For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with

53 analysis 79

VDD2

Rind

RTG

AdiabaticDriver

CL

M1

T=1f

L ind

RTG

CL

Figure 55 Resonant pulse generator with dual-rail data encoding

Tier2

Tier1

f-fPULSE

VDD2CLK IN

Delay1CLK1

CLK0

Delay2CLK2OUT1

Data In

Data Out

CLKRES

TSV TSV TSV

CLKBufferAdiabatic

Driver

P2LC

f-f

Data1

M1

Figure 56 Energy-recovery scheme for TSV interconnects

a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at

f =1

2πradic

LindCL(54)

A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56

53 analysis

The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy

53 analysis 80

losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme

The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be

EER = EAD + Eind + EM1 + EP2L (55)

Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs

architecture used and could be easily extracted from simulationdata as will be shown later in this chapter

531 Adiabatic driver

There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS

dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances

The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle

ETG = 2ξ

(

RTGCL12 T

)

CLV2DD =

π2

2(RTGCL f )CLV2

DD (56)

The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient

53 analysis 81

Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV

capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be

CL = CTSV + Cn + 6CD (57)

The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience

κTG = RTGCn rArr RTG =κTG

Cn(58)

Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2

DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver

EAD = CnV2DD +

π2

2κTG

Cnf [CTSV + Cn + 6CD]

2 V2DD (59)

The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2

2 κTG f to a variable (y = π2

2 κTG f ) Equation 59then becomes

EAD = D middot CnV2DD +

y

Cn[CTSV + (6b + 1)Cn]

2 V2DD

=

[

Cn

(

D

y+ (6b + 1)2

)

+1

CnC2

TSV

]

middot yV2DD

+(12b + 2)CTSV middot yV2DD (510)

Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when

53 analysis 82

they become equal Solving equation Cn

(

Dy + (6b + 1)2

)

= 1Cn

C2TSV

for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV

Cn(opt) =

radic

[D

y+ (6b + 1)2]minus1CTSV (511)

Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver

EAD(min) =

[radic

D

(6b + 1)2y+ 1 + 1

]

middot (12b+ 2)yCTSVV2DD (512)

532 Inductorrsquos parasitic resistance

The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as

Qind =1

Rind

radic

Lind

CLrArr Rind =

1QindCL2π f

(513)

Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513

Eind =π2

2(RindCL f )CLV2

DD =π

4CL

QindV2

DD (514)

533 Switch M1

Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1

Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as

EM1(min) = 2IM1(rms)VGM1

radic

mκM1

f(515)

54 simulations 83

Delay2

CLK1

CLK2Delay1

CLKW

PULSE

CLK0T

PD

OUT1

PULSE

P2L

RES

CLK

Figure 57 Energy-recovery timing constraints

534 Timing Constraints

The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57

The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)

54 simulations

To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the

54 simulations 84

M1IN IN

IN

OUT OUT

IN

Figure 58 Test case for the evaluation of the theoretical model

Q250MHz 500MHz 750MHz

T S e () T S e () T S e ()

6 360 409 -120 476 516 -78 578 613 -57

8 308 347 -114 417 449 -70 515 543 -52

10 276 312 -115 382 411 -70 477 498 -43

12 255 289 -119 358 386 -71 451 472 -44

Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)

adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58

The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51

The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations

In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency

54 simulations 85

5 6 7 8 9 10 11 12 130

10

20

30

40

50

60

70

T250 S250 T500 S500 T750 S750

Q factor

Energydissipation(fF)

Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)

5 6 7 8 9 10 11 12 13

-14

-12

-10

-8

-6

-4

-2

0

250MHz 500MHz 750MHz

Q factor

Error()

Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator

The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation

54 simulations 86

M1

TSV

TSV

TSV

Data in OutCMOS

OutER

Energy-recovery

CMOS

P2LC

Adiabaticdriver

Figure 511 Test case for comparison of energy-recovery to CMOS

541 Comparison to CMOS

When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F

for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is

described by Equation 52 Appending the data switching activity(D)

ECMOS = D middot 12

CLV2DD (516)

Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS

circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison

Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range

54 simulations 87

Figure 512 Improved P2LC design

0 100 200 300 400 500 600 700 800 9000

2

4

6

8

10

12

14

Frequency (MHz)

Energy

dissipation(fJ)

Figure 513 Simulated energy dissipation of improved P2LC design

Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current

The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor

54 simulations 88

4 6 8 10 12 14 16 18 20 22-20

-10

0

10

20

30

40

50

60

800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor

Energyred uction()

Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)

4 6 8 10 12 14 16 18 20 220050

100150200250300350400450

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)

values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors

In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies

In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q

factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total

In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS

circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in

54 simulations 89

4 6 8 10 12 14 16 18 20 2200

100

200

300

400

500

600

700

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)

20 40 60 80 100 120 140 160 180 200 220-100

00

100

200

300

400

500

Q6 Q10 Q14 Q18

TSV capacitance (fF)

Energy

red u

ction(

)

Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)

Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value

Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz

Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm

node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS

54 simulations 90

4 6 8 10 12 14 16 18 20 22-40

-20

0

20

40

60

80

D=05 D=07 D=09 D=1

Q factor

Energyred uction()

Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)

4 6 8 10 12 14 16 18 20 220

10

20

30

40

50

60

130nm 65nm

Q factor

Energyred uction()

Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)

55 summary and conclusions 91

55 summary and conclusions

The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process

The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement

Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme

The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process

6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R

61 introduction

In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance

Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements

62 physical implementation

The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5

The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61

The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz

were calculated using the methodology developed in Chapter 5

and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery

circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)

92

62 physical implementation 93

AB

AB

CDEFD

C

ABCD

EFD

C

Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)

Adiabatic driver 80-bit circuit

Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )

RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)

CL 207 f F (Eq57) Lind 613nH (Eq54)

Table 61 Energy-recovery IO circuit parameters

62 physical implementation 94

IntegratedInductor

VINDOER

PulseGenerator

S1AS2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

Figure 62 Energy-recovery IO circuit schematic diagram

Metal2Metal1PolyActiveVDD

GND

RES_CLK

DATA

OUT

OUT_B

DATA

OUT_B OUT

RES_CLK

GND

VDD

Figure 63 Adiabatic driver (layoutschematic view)

the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs

621 Adiabatic driver

The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63

62 physical implementation 95

Metal2Metal1PolyActive

IN

OUT_B

VDD

IN_B

OUT

GND

OUT_B

IN

GND

VDD

OUT

IN_B

Figure 64 P2LC (layoutschematic view)

622 Pulse-to-level converter

The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals

623 Data generator

A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1

As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE

624 Pulse generator

The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the

62 physical implementation 96

Metal2Metal1PolyActive EN

CLK

VDD

Q

GND

D

CLKGND

ENVDD

QF-F

D

GND

VDDCLK

D Q

Figure 65 Data generator (layoutschematic view)

Delay1

WPULSE

PULSE

CLKTCLK

DATA

CLK_RES

Figure 66 Timing constraints for the energy-recovery demonstratorcircuit

62 physical implementation 97

PULSES1A

S2

GND

VDD

Metal2Metal1PolyActive

VDD

S1

S2

GND

PULSE

Figure 67 Pulse generator (layoutschematic view)

S2ΔPhase

WPULSE

PULSE

S1A

Figure 68 Pulse generator InputOutput signals

gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control

For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68

625 Integrated inductor

The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-

1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)

62 physical implementation 98

100 200 300 400 500 600 700 8005

7

9

11

13

15

17

19

21

10

15

20

25

30

35

40

45

Q (Simulated) Energy Reduction ()

Frequency (MHz)

Qfactor

Energyreduction()

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS

lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)

As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited

For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610

626 Top level design

The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)

For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously

62 physical implementation 99

500μ

m

500μm

Figure 610 Planar square spiral inductor designed in ASITIC

shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance

The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG

80 = 16880 = 21Ω However as

the current flows to lower branch levels in the tree the perceivedresistance increases to RTG

16 RTG4 and on the last one to RTG Any

additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512

Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20

4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit

To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of

62 physical implementation 100

Adiabatic drivers

RTG

16

RTG

80

RTG

4

RTG

Level 1Level 2Level 3Level 4

Figure 611 Resonant pulse distribution tree

Branch level Wire resistance

1st 5 middot 16880 = 105mΩ

2nd 5 middot 16816 = 525mΩ

3rd 5 middot 1684 = 21Ω

4th 5 middot 168Ω = 84Ω

Total load 168+8480 + 21

20 + 05255 + 0105 = 252Ω

Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree

63 post-layout simulation and comparison 101

+-

+-Metal2 Metal1

180μ

m

100μm

Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)

the energy-recovery IO circuit in layout view is illustrated inFigure 613

627 CMOS IO circuit

The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance

The top level layout view of the CMOS IO circuit can be seenin Figure 616

63 post-layout simulation and comparison

The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed

63 post-layout simulation and comparison 102

22mm

12mm

GND GNDVBUF1 VDR

GND

VIND

S1A

S2

EN

OER

Figure 613 Energy-recovery IO circuit in layout view

63 post-layout simulation and comparison 103

OCMOS

S1B DataGenerator

TSV

CMOSDriver

CMOSDriver

TSV Buffer

1

80

EN

Figure 614 CMOS IO circuit schematic diagram

GND

VCMOS

DATA

078u

039u

105u

053u

316u

158u

950u

475u

OUT

Figure 615 CMOS driver schematic

12mm

12mm

OCMOS S1B GND VBUF2 VCMOSGND

GND GNDEN VCMOS

Figure 616 CMOS IO circuit in layout view

63 post-layout simulation and comparison 104

Energy dissipation (80-bits cycle)

Inductor 11pJ (Eq514)

Switch M1 092pJ [ ]

P2LCs 098pJ (Simulated)

Adiabatic drivers 315pJ (Eq512)

Total 615pJ

Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz

After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency

Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme

As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the

63 post-layout simulation and comparison 105

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14RES_CLK

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14DATA

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14OER

Time (ns)

V(V

)

Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit

64 experimental measurements 106

AreaEnergy

dissipation(Theory)

Energydissipation

(Post-layout)

CMOS 144mm2922pJ

(115 f Jbit)

(Eq52)

1265pJ

(158 f Jbit)

Energy recovery 264mm2 615pJ

(77 f Jbit)

768pJ

(96 f Jbit)

Difference +83 minus333 minus393

Table 64 Post-layout simulated energy dissipation Area require-ments

theoretically calculated 333 up to 393 in the post-layoutsimulations

64 experimental measurements

The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively

641 Measurement setup

For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620

The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66

642 Results

CMOS IO circuit

In Figure 621 the input signal S1B and the output signal OCMOS

are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency

64 experimental measurements 107

Figure 618 Energy-recovery IO circuit micrograph

Figure 619 CMOS IO circuit micrograph

AABC

DEFF

D

EEBA

EBBA

Figure 620 Measurement setup

64 experimental measurements 108

Test pad Direction Instrument connection

VIND Power Power supply - gt06V

GND Power Power supply - 0V

S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)

EN InputDigital Output (enable outputdata switching)

OER Output Oscilloscope

VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V

VBUF1 Power Power supply (Buffers) - 12V

Table 65 Energy-recovery circuit instrument connections

Test pad Direction Instrument connection

VCMOS PowerPower supply (CMOS drivers) -12V

GND Power Power supply - 0V

VBUF2 Power Power supply (Buffers) - 12V

S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

OCMOS Output Oscilloscope

EN InputDigital Output (enable outputdata switching)

Table 66 CMOS circuit instrument connections

64 experimental measurements 109

S1B

OCMOS

Figure 621 Oscilloscope output for signals S1B and OCMOS

100 200 300 400 500 600 700 800012345678910

Die1 Die2 Simulation

Frequency (MHz)

Powerdissipation(mW)

Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit

that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected

The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations

Energy-recovery IO circuit

Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER

64 experimental measurements 110

100 200 300 400 500 600 700 800024681012141618

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 623 Measuredsimulated energy dissipation for the CMOScircuit

S1A

OER

Figure 624 Oscilloscope output for signals S1A and OER

shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected

The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)

Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate

65 summary and conclusions 111

200 250 300 350 400 450 500 550 600 65002468101214161820

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit

IntegratedInductor

VINDOER

PulseGenerator

S1A

S2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

200pF

Figure 626 Energy-recovery IO circuit with fault hypothesis

properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER

switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation

65 summary and conclusions

In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them

The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-

65 summary and conclusions 112

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050

055

060

065

070

075

080RES_CLK

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500

02

04

06

08

10

12

14OER

Time (ns)

V(V

)

Figure 627 Simulation of circuit with fault hypothesis

65 summary and conclusions 113

200 250 300 350 400 450 500 550 600 6500

5

10

15

20

25

Die1 Die2 Simulation CL=200pF

Frequency (MHz)

Ene

rgy

diss

ipat

ion

(pJ)

Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis

ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS

circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated

CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

7C O N C L U S I O N S

Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density

As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV

devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs

Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-

FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process

Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process

The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work

114

71 main contributions 115

71 main contributions

From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data

The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations

In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design

72 future work 116

approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances

The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

72 future work

The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV

capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]

In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes

1 The transistors should be in a regular grid in the array forsimple processing of the measurement data

72 future work 117

2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy

3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV

4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues

Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility

In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory

AA P P E N D I X A P H O T O S

Figure A1 3D65 300mm wafer (on chuck)

Figure A2 Probe card

118

appendix a photos 119

Figure A3 3D130 200mm wafer (on chuck)

ProbeHead

Microscope

Figure A4 Probe head

BA P P E N D I X B S TAT I S T I C S

mean (micro)

In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as

micro =1n

n

sumi=1

xi

median (m)

The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean

standard deviation (σ)

Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation

A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability

standard deviation of the mean (σmicro )

The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated

Assuming there is a data set X with n values xi of mean micro

120

appendix b statistics 121

variance(X) = σ2X

variance(micro) = variance( 1n sum

ni=1 xi) =

1n2 variance(sumn

i=1 xi) =1n2 sum

ni=1 variance(xi) =

nn2 variance(X) =1n variance(X) =

σ2Xn rarr

σmicro = σXradicn

coefficient of variation

The coefficient of variation is the ratio of the standard deviationσ to the mean micro

CV =σ

micro

It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage

outlier

Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set

kruskalndash wallis [34]

KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal

B I B L I O G R A P H Y

[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac

currentsourcebroadpurposemn=2602

[2] Max-3d URL httpwwwmicromagiccom

[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet

[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet

[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001

[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI

Design Conf pages 171ndash174 2005 doi 101109ICVD200564

[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf

3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581

[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009

[9] C H Bennett Logical reversibility of computation IBM

Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525

[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038

scientificamerican0785-48

[11] E Beyne 3d system integration technologies In Proc Int

VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113

122

bibliography 123

[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs

22nm-Details_Presentationpdf

[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327

[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534

[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices

Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124

[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942

[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc

IEEE Int Interconnect Technology Conf and 2011 Materials for

Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326

[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf

(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823

[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS

rsquo04 volume 2 2004 doi 101109ISCAS20041329255

[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE

Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260

[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-

bibliography 124

asitics for 3-dimensional integrated circuits In Workshop

Notes Design Automation and Test in Europe (DATE) 2009

[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT

2008 pages 98ndash103 2008 doi 101109IDT20084802475

[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec

beScientificReportSR2009HTML1213307html

[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905

[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE

Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329

[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int

Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311

[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508

[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712

[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii

B9780750677295500446

[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455

bibliography 125

[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591

[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test

Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889

[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on

Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10

1145224081224115

[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical

Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779

[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom

pressroompdfkkuhnKuhn_IWCE_invited_textpdf

[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5

(3)183ndash191 1961 doi 101147rd530183

[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827

[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190

[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf

(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839

[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-

nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079

bibliography 126

[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System

level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm

org101145966747966750

[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453

[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf

PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725

[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE

Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793

[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629

[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp

Exhibition (DATE) pages 1689ndash1694 2010

[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573

[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190

[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi

bibliography 127

A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices

Meeting (IEDM) 2010 doi 101109IEDM20105703278

[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727

[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10

1109JPROC1998658762

[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test

Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103

[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc

56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676

[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443

[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88

(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom

sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on

[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303

bibliography 128

[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium

on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145

10576611057669

[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872

[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004

[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-

sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888

[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156

[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc

IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522

[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501

[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667

[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477

[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-

tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609

bibliography 129

[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology

Conf 2006 doi 101109ECTC20061645678

[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power

Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220

[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc

Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786

[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-

Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338

[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088

[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044

[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int

Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763

[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-

tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http

dxdoiorg101007BF01788562 101007BF01788562

bibliography 130

[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-

electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww

sciencedirectcomsciencearticleB6V44-4829XDJ-2N

2f5b2c221dcf240c38cd13b7b3432f913

[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182

[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541

[78] NHE Weste and DF Harris CMOS VLSI design a cir-

cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775

[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite

printsteve_wozniak_interview

[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-

tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716

[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect

Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710

[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746

[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764

[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915

bibliography 131

[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610

  • Abstract
  • Publications
  • Acknowledgments
  • Contents
  • List of Figures
  • List of Tables
  • Acronyms
  • 1 Introduction
    • 11 Research goals and contribution
    • 12 Thesis outline
      • 2 Background
        • 21 3D integrated circuits
          • 211 3D integration approaches
          • 212 TSV interconnects
            • 22 Energy-recovery logic
              • 221 Energy-recovery architectures
              • 222 Power-clock generators
                • 23 Full-custom design flow
                • 24 Summary
                  • 3 Characterization of TSV capacitance
                    • 31 Introduction
                    • 32 TSV parasitic capacitance
                    • 33 Integrated capacitance measurement techniques
                    • 34 Electrical characterization
                      • 341 Physical implementation
                      • 342 Measurement setup
                      • 343 Experimental results
                        • 35 Summary and conclusions
                          • 4 TSV proximity impact on MOSFET performance
                            • 41 Introduction
                            • 42 TSV-induced thermomechanical stress
                            • 43 Test structure implementation
                              • 431 MOSFET arrays
                              • 432 Digital selection logic
                                • 44 Experimental measurements
                                  • 441 Data processing methodology
                                  • 442 Measurement setup
                                  • 443 PMOS (long channel)
                                  • 444 PMOS (short channel)
                                    • 45 Summary and conclusions
                                      • 5 Evaluation of energy-recovery for TSV interconnects
                                        • 51 Introduction
                                        • 52 Energy-recovery TSV interconnects
                                        • 53 Analysis
                                          • 531 Adiabatic driver
                                          • 532 Inductors parasitic resistance
                                          • 533 Switch M1
                                          • 534 Timing Constraints
                                            • 54 Simulations
                                              • 541 Comparison to CMOS
                                                • 55 Summary and conclusions
                                                  • 6 Energy-recovery 3D-IC demonstrator
                                                    • 61 Introduction
                                                    • 62 Physical implementation
                                                      • 621 Adiabatic driver
                                                      • 622 Pulse-to-level converter
                                                      • 623 Data generator
                                                      • 624 Pulse generator
                                                      • 625 Integrated inductor
                                                      • 626 Top level design
                                                      • 627 CMOS IO circuit
                                                        • 63 Post-layout simulation and comparison
                                                        • 64 Experimental measurements
                                                          • 641 Measurement setup
                                                          • 642 Results
                                                            • 65 Summary and conclusions
                                                              • 7 Conclusions
                                                                • 71 Main contributions
                                                                • 72 Future work
                                                                  • A Appendix A Photos
                                                                  • B Appendix B Statistics
                                                                  • Bibliography
Page 6: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate

P U B L I C AT I O N S

Some ideas and figures have appeared previously in the followingpublications

1 D Perry J Cho S Domae P Asimakopoulos A YakovlevP Marchal G Van der Plas and N Minas An efficientarray structure to characterize the impact of through siliconvias on fet devices In Proc IEEE Int Microelectronic TestStructures (ICMTS) Conf pages 118ndash122 2011 (best paper

award)

2 A Mercha G Van der Plas V Moroz I De Wolf P Asi-

makopoulos N Minas S Domae D Perry M Choi ARedolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron DevicesMeeting (IEDM) 2010

3 A Mercha A Redolfi M Stucchi N Minas J Van OlmenS Thangaraju D Velenis S Domae Y Yang G Katti RLa- bie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D Perry G Vander Plas J H Cho P Marchal Y Travaly E Beyne S Biese-mans and B Swinnen Impact of thinning and throughsilicon via proximity on high-k metal gate first cmos per-formance In Proc Symp VLSI Technology (VLSIT) pages109ndash110 2010

4 P Asimakopoulos G Van der Plas A Yakovlev and PMarchal Evaluation of energy-recovering interconnects forlow-power 3d stacked ics In Proc IEEE Int Conf 3D Sys-tem Integration 3DIC 2009 pages 1ndash5 2009

5 P Asimakopoulos and A Yakovlev An AdiabaticPower-Supply Controller For Asynchronous Logic Circuits20th UK Asynchronous Forum pages 1ndash4 2008

iv

In the end I hope therersquos a little note somewhere

that says I designed a good computer

mdash Steve Wozniak [79]

A C K N O W L E D G E M E N T S

First and foremost I would like to express my gratitude to mysupervisor Prof Alex Yakovlev who provided the perfect balancebetween freedom and guidance allowing me to obtain sphericalknowledge and scope in microelectronics but at the same time topromptly complete my studies Irsquom also grateful to the Engineer-ing and Physical Science Research Council (EPSRC) for fundingmy PhD studies through their Doctoral Training Accounts

As part of my research project I had the opportunity to spendsome time at Imec Belgium The experiences and lessons learnedduring my stay at Imec will undoubtedly prove valuable in myforthcoming professional career I would like to thank Dan Perryfrom Qualcomm for sharing his experience and ideas for the de-sign of the characterization structures implemented on the 65nm

Imec process Special thanks to Shinichi Domae from Panasonicand Dr Dimitrios Velenis for their assistance with the 65nm wafermeasurements as well as Marco Facchini for the time he spendtrying to measure the energy-recovery circuits Last but not leastIrsquod like to thank Geert Van der Plas and Paul Marchal for theiradvices and guidance throughout my stay at Imec

Finally many thanks to my colleagues at Newcastle Universityfor assisting me in diverse ways during my studies I would liketo especially mention Dr Santosh Shedabale and James Dochertyfor reviewing and commenting on parts of this text as well asDr Robin Emery for the numerous technical discussions and forhelping setup the design tools

v

C O N T E N T S

1 Introduction 1

11 Research goals and contribution 2

12 Thesis outline 3

2 Background 5

21 3D integrated circuits 5

211 3D integration approaches 6

212 TSV interconnects 9

22 Energy-recovery logic 13

221 Energy-recovery architectures 17

222 Power-clock generators 18

23 Full-custom design flow 20

24 Summary 21

3 Characterization of TSV capacitance 23

31 Introduction 23

32 TSV parasitic capacitance 23

33 Integrated capacitance measurement techniques 24

34 Electrical characterization 27

341 Physical implementation 28

342 Measurement setup 36

343 Experimental results 37

35 Summary and conclusions 44

4 TSV proximity impact on MOSFET performance 46

41 Introduction 46

42 TSV-induced thermomechanical stress 46

43 Test structure implementation 48

431 MOSFET arrays 51

432 Digital selection logic 55

44 Experimental measurements 61

441 Data processing methodology 61

442 Measurement setup 62

443 PMOS (long channel) 64

444 PMOS (short channel) 71

45 Summary and conclusions 73

5 Evaluation of energy-recovery for TSV interconnects 75

51 Introduction 75

52 Energy-recovery TSV interconnects 75

53 Analysis 79

531 Adiabatic driver 80

532 Inductorrsquos parasitic resistance 82

533 Switch M1 82

534 Timing Constraints 83

54 Simulations 83

541 Comparison to CMOS 86

vi

contents vii

55 Summary and conclusions 91

6 Energy-recovery 3D-IC demonstrator 92

61 Introduction 92

62 Physical implementation 92

621 Adiabatic driver 94

622 Pulse-to-level converter 95

623 Data generator 95

624 Pulse generator 95

625 Integrated inductor 97

626 Top level design 98

627 CMOS IO circuit 101

63 Post-layout simulation and comparison 101

64 Experimental measurements 106

641 Measurement setup 106

642 Results 106

65 Summary and conclusions 111

7 Conclusions 114

71 Main contributions 115

72 Future work 116

a Appendix A Photos 118

b Appendix B Statistics 120

bibliography 122

L I S T O F F I G U R E S

Figure 11 Scaling trend of logic and interconnect delay[30] 2

Figure 21 Comparison of 2D3D Integration 5

Figure 22 System-in-a-package 6

Figure 23 Package-on-package 6

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7

Figure 25 3D wafer-level-packaging 7

Figure 26 3D-stacked IC 8

Figure 27 TSV process steps 9

Figure 28 Before thinningafter thinning 10

Figure 29 TSV stacking 10

Figure 210 Conventional AND gate 14

Figure 211 CMOS switching 16

Figure 212 Adiabatic switching 16

Figure 213 Basic 2N-2P adiabatic circuit 17

Figure 214 2N-2P timing 17

Figure 215 Adiabatic driver 18

Figure 216 Stepwise charging generator 19

Figure 217 1N power clock generator 20

Figure 218 Design flow 22

Figure 219 Inverter in Virtuoso schematic 22

Figure 220 Inverter in Virtuoso layout 22

Figure 31 Isolated 2D wire capacitance 24

Figure 32 TSV parasitic capacitance 25

Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25

Figure 34 CBCM pseudo-inverters 27

Figure 35 Current as a function of frequency in CBCM 28

Figure 36 CBCM pseudo-inverter pair in layout 29

Figure 37 PRC1 test pad module (layout) 30

Figure 38 PRC1 test pad module (micrograph) 30

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34

Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35

viii

List of Figures ix

Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35

Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36

Figure 316 Measurement setup 36

Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39

Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40

Figure 319 TSV capacitance measurement results onwafer D11 40

Figure 320 TSV capacitance measurement results onwafer D18 41

Figure 321 Simulated oxide liner thickness variation 42

Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42

Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44

Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44

Figure 41 TSV thermomechanical stress simulation [40] 47

Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48

Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49

Figure 44 PTP1 test pad module (layout) 50

Figure 45 PTP1 test pad module (micrograph) 50

Figure 46 MOSFET dimensions and rotations 51

Figure 47 1 TSV MOSFET array configuration 52

Figure 48 4 diagonal TSVs MOSFET array configuration52

Figure 49 4 lateral TSVs MOSFET array configuration 53

Figure 410 9 TSVs MOSFET array configuration 53

Figure 411 MOSFET array detail 55

Figure 412 Digital selection logic 56

Figure 413 MOSFET array with digital selection logic 57

Figure 414 4-bit Gate-enableArray-enable decoder 58

Figure 415 Transmission gate for MOSFET Gate terminals59

Figure 416 Transmission gates for MOSFET Drain-Source terminals 59

Figure 417 2-bit Drain-select decoder 60

Figure 418 Processing of measurement results 62

Figure 419 Interpolation of measurement results 63

Figure 420 Measurement setup 63

Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65

List of Figures x

Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66

Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66

Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67

Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67

Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68

Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69

Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69

Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70

Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70

Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71

Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71

Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72

Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72

Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76

Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77

Figure 53 Adiabatic driver 78

Figure 54 A simple pulse-to-level converter (P2LC) 78

Figure 55 Resonant pulse generator with dual-rail dataencoding 79

Figure 56 Energy-recovery scheme for TSV interconnects79

Figure 57 Energy-recovery timing constraints 83

Figure 58 Test case for the evaluation of the theoret-ical model 84

Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85

Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85

Figure 511 Test case for comparison of energy-recoveryto CMOS 86

Figure 512 Improved P2LC design 87

List of Figures xi

Figure 513 Simulated energy dissipation of improvedP2LC design 87

Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88

Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88

Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89

Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =

80 f F f = 500MHz) 89

Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90

Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90

Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93

Figure 62 Energy-recovery IO circuit schematic dia-gram 94

Figure 63 Adiabatic driver (layoutschematic view) 94

Figure 64 P2LC (layoutschematic view) 95

Figure 65 Data generator (layoutschematic view) 96

Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96

Figure 67 Pulse generator (layoutschematic view) 97

Figure 68 Pulse generator InputOutput signals 97

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98

Figure 610 Planar square spiral inductor designed inASITIC 99

Figure 611 Resonant pulse distribution tree 100

Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101

Figure 613 Energy-recovery IO circuit in layout view 102

Figure 614 CMOS IO circuit schematic diagram 103

Figure 615 CMOS driver schematic 103

Figure 616 CMOS IO circuit in layout view 103

Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105

Figure 618 Energy-recovery IO circuit micrograph 107

List of Figures xii

Figure 619 CMOS IO circuit micrograph 107

Figure 620 Measurement setup 107

Figure 621 Oscilloscope output for signals S1B and OC-MOS 109

Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109

Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110

Figure 624 Oscilloscope output for signals S1A and OER110

Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111

Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111

Figure 627 Simulation of circuit with fault hypothesis 112

Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113

Figure A1 3D65 300mm wafer (on chuck) 118

Figure A2 Probe card 118

Figure A3 3D130 200mm wafer (on chuck) 119

Figure A4 Probe head 119

L I S T O F TA B L E S

Table 21 3D interconnect technologies comparison 8

Table 22 Technology parameters for Imec 3D130 process11

Table 23 Technology parameters for Imec 3D65 process11

Table 24 Comparison energy-recoveryconventionalCMOS 16

Table 31 Test structures on the PRC1 module 31

Table 32 PRC1 test pad instrument connections 37

Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38

Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39

Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43

Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45

Table 41 Coefficients of thermal expansion 47

Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54

Table 43 MOSFET array configurations selected forexperimental evaluation 61

Table 44 PTP1PTP2 test pad module instrumentconnections 64

Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73

Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84

Table 61 Energy-recovery IO circuit parameters 93

Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100

Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104

Table 64 Post-layout simulated energy dissipation Area requirements 106

Table 65 Energy-recovery circuit instrument connec-tions 108

Table 66 CMOS circuit instrument connections 108

xiii

A C R O N Y M S

2D two-dimentional

2N-2P 2 nMOS2 pMOS adiabatic logic

3D-IC three-dimentional integrated circuit

3D-SIC 3D stacked-IC

3D-SIP 3D system-in-package

3D three-dimentional

3D-WLP 3D wafer-level-packaging

BCB benzocyclobutene

BEOL back-end-of-line

CAL clocked adiabatic logic

CBCM charge-based capacitance measurement

CMOS complementary metal-oxide-semiconductor

CMP chemical-mechanical planarization

CTE coefficient of thermal expansion

Cu copper

CVD chemical vapor deposition

D2D die-to-die

D2W die-to-wafer

DCVSL differential cascode voltage switch logic

DRC design rule check

DUT device under test

ECRL efficient charge recovery logic

EDA electronic design automation

FEM finite element method

FEOL front-end-of-line

GPIB general purpose interface bus

IC integrated circuit

xiv

acronyms xv

IO inputoutput

KOZ keep-out-zone

LC inductance-capacitance

LVS layout versus schematic

MEMS microelectromechanical systems

MIM metal-insulator-metal

MOSFET metal-oxide-semiconductor field-effect transistor

MOS metal-oxide-semiconductor

MPU microprocessor unit

MuGFET multiple gate field-effect transistor

NMOS n-channel MOSFET

P2LC pulse-to-level converter

PAL pass-transistor adiabatic logic

PCB printed circuit board

PCG power-clock generator

PMD pre-metal dielectric

PMOS p-channel MOSFET

PVT process voltage temperature

QSERL quasi-static energy recovery logic

RC resistance-capacitance

RDL re-distribution layer

RLC resistance-inductance-capacitance

RMS root mean square

SIP system-in-a-package

Si silicon

SMU source measurement unit

SoC system-on-chip

TaN tantalum nitride

TSV through-silicon via

W tungsten

acronyms xvi

W2W wafer-to-wafer

WLP wafer-level-packaging

1I N T R O D U C T I O N

Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965

chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]

As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects

In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)

The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and

1 International Technology Roadmap for Semiconductors

1

11 research goals and contribution 2

Figure 11 Scaling trend of logic and interconnect delay [30]

innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]

Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D

interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density

11 research goals and contribution

As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-

ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]

The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future

12 thesis outline 3

3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection

The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution

The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs

are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process

The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements

12 thesis outline

This thesis is organised into seven Chapters and two Appendixesas follows

2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)

12 thesis outline 4

Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described

Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods

Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process

Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design

Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements

Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work

Appendix A includes photos used as reference in various partsof this work

Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data

2B A C K G R O U N D

The work presented in this thesis contributes in the fields of 3D

integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters

21 3d integrated circuits

CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package

There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy

Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of

ABCDEFBA ABCDEFBA

Figure 21 Comparison of 2D3D Integration

5

21 3d integrated circuits 6

ABCDEFBC

E

Figure 22 System-in-a-package

ABCDEFBC

E

Figure 23 Package-on-package

a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities

The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor

211 3D integration approaches

3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure

bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages

bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)

bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion

The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)

3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps

21 3d integrated circuits 7

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]

A

BCDEAFE

FFDC

CFE

Figure 25 3D wafer-level-packaging

take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]

3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]

The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-

SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost

A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21

21 3d integrated circuits 8

ABCD

EFFDB

F

Figure 26 3D-stacked IC

3D interconnect

technologyAdvantages Disadvantages

3D system-in-package (3D-SIP)

Simple existingtechnology bestheterogeneous

integration

Performance(speed power) low

densitymanufacturing costin high-quantities

3D wafer-level-packaging (3D-

WLP)

Existingtechnology good

compromisebetween

performance-manufacturing

cost

Averageperformance and

density

3D stacked-IC (3D-SIC)

Performance(speed power)

high densitymanufacturing costin high-quantities

Manufacturing costin low-quantities

not ready formanufacturing

Table 21 3D interconnect technologies comparison

21 3d integrated circuits 9

ABBCDE

FBACACDA

ACCD F

Figure 27 TSV process steps

212 TSV interconnects

Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection

In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie

Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively

As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]

21 3d integrated circuits 10

AAAB

CAAAB

ABC

DEF

Figure 28 Before thinningafter thinning

ABC ABC

DE DE

F

DEE

B

CBA

Figure 29 TSV stacking

21 3d integrated circuits 11

Minimum devicechannel length (nm)

130

Supply voltage (V) 12

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 22

TSV resistance (mΩ) ~20

TSV capacitance( f F)

~40

Wafer diameter(mm)

200

Table 22 Technology paramet-ers for Imec 3D130

process

Minimum devicechannel length (nm)

70

Supply voltage (V) 10

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 40

TSV resistance (mΩ) ~40

TSV capacitance( f F)

~80

Wafer diameter(mm)

300

Table 23 Technology paramet-ers for Imec 3D65 pro-cess

Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing

Design tools

Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs

Thermomechanical stress

TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur

21 3d integrated circuits 12

The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance

Thermal analysis

The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink

Electrical characteristics

The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]

Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV

structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process

22 energy-recovery logic 13

Testing

Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems

22 energy-recovery logic

TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology

1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections

2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them

Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that

22 energy-recovery logic 14

Figure 210 Conventional AND gate

were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6

Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded

Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output

Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to

22 energy-recovery logic 15

perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]

In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]

The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =

CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]

E = QVDD = CV2DD (21)

From the total transferred energy only half ( 12 CV2

DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C

The adiabatic charging of an equivalent node capacitance C

is modelled in Figure 212 In contrast to the conventional CMOS

circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to

EAdiabatic prop

(

RC

T

)

CV2DD (22)

The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters

22 energy-recovery logic 16

A

B

CC

D

E

D

Figure 211 CMOS switching

AA

B

C

B

Figure 212 Adiabatic switching

Conventional CMOS Energy-recovery CMOS

2 states True False 3 states True False Off

Energy of information bitdissipated as heat on the

pull-down network

Energy of information bitrecovered by the

power-supply

High speed Lowmedium speed

High energy dissipation Very low energy dissipation

Area efficient Some area overhead

Table 24 Comparison energy-recoveryconventional CMOS

Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]

A comparison between energy-recovery and conventional CMOSis summarised in Table 24

22 energy-recovery logic 17

ABCABC

Figure 213 Basic 2N-2P adiabaticcircuit

ABC

ABC

DEF

EE

DEF

Figure 214 2N-2P timing

221 Energy-recovery architectures

Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency

A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P

circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern

The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel

22 energy-recovery logic 18

Figure 215 Adiabatic driver

MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network

The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]

The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle

Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]

222 Power-clock generators

Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and

22 energy-recovery logic 19

A

A

B

C

C

Figure 216 Stepwise charging generator

recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators

Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow

Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at

f =1

2πradic

LCLOAD

(23)

As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is

23 full-custom design flow 20

A

BCD

E

DFF

CD

Figure 217 1N power clock generator

partially recovered and stored in the inductor for later use RLC

oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip

23 full-custom design flow

All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal

In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available

The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case

1 Cadence Design Systems Inc (httpwwwcadencecom)

24 summary 21

the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step

Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics

24 summary

In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process

2 Mentor Graphics Inc (httpwwwmentorcom)

24 summary 22

ABCDEA

FCCAAE

ABCDEAEC

DE

E

F

CEA

EEAEA

EDE

CAC

E

EC

CAC

Figure 218 Design flow

Figure 219 Inverter in Virtuososchematic

Figure 220 Inverter in Virtuosolayout

3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E

31 introduction

In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology

In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]

From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system

Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated

32 tsv parasitic capacitance

A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric

1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments

23

33 integrated capacitance measurement techniques 24

ABC

Figure 31 Isolated 2D wire capacitance

material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is

C =εox

hwl (31)

In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material

In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)

The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)

33 integrated capacitance measurement techniques

Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV

33 integrated capacitance measurement techniques 25

ABCDE

F

FD

F

F

FD

Figure 32 TSV parasitic capacitance

ABCDE FACDE

ECDE

BCFACDE

FCCDED DBCDE

D

DAB

Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor

33 integrated capacitance measurement techniques 26

characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories

1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]

2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]

The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer

On-chip techniques can achieve better accuracy when the DUT

capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]

In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows

CDUT =Im minus Ire f

VDD middot f(32)

The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of

34 electrical characterization 27

AB

AC

DE

FE

DE

FE

AB

AC

B

Figure 34 CBCM pseudo-inverters

Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot

The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET

capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz

should be adequate to avoid the MOS behaviour of the TSV

34 electrical characterization

A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented

34 electrical characterization 28

AB

CADE

F

AB

E

CAD

B

Figure 35 Current as a function of frequency in CBCM

on the same chip that went beyond the core objectives of thiswork

The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed

341 Physical implementation

In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result

Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics

34 electrical characterization 29

Poly dummy fills

Metal2Metal1PolysiliconActive

1μm 5μm

Figure 36 CBCM pseudo-inverter pair in layout

Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38

The CBCM pseudo-inverters were implemented on either TOP

or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below

Structure rsquoArsquo Single TSV capacitance (TOP die)

The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to

34 electrical characterization 30

DTOP DBTM

B10 A10

BREF AREF

B1 A1

R1 R2

SET VDD

RESET GND

C15 C10

C50 C20

SET VDD

ERC SET

EREF EC Metal2 (TOP)Metal1 (TOP)

Metal1 (BTM)Metal2 (BTM)

Figure 37 PRC1 test pad module(layout)

Figure 38 PRC1 test pad module(micrograph)

34 electrical characterization 31

Structure A B

TSVs - 1 2x5 - 1 2x5

Inverterlocation

TOP TOP TOP BTM BTM BTM

TSVconnection

TOP TOP TOP BTM BTM BTM

Multi-TSVconnection

- - TOP - - BTM

Multi-TSVpitch (microm)

- - 15 - - 15

Pad name AREF A1 A10 BREF B1 B10

Structure C D

TSVs 2x2 2x2 2x2 2x2 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP TOP

TSVconnection

TOP TOP TOP TOP TOP TOP

Multi-TSVconnection

TOP TOP TOP TOP TOP BTM

Multi-TSVpitch (microm)

10 15 20 50 20 20

Pad name C10 C15 C20 C50 DTOP DBTM

Structure R E

TSVs 1 1 - 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP

TSVconnection

TOP TOP - TOP TOP

Multi-TSVconnection

- - - TOP BTM

Multi-TSVpitch (microm)

- - - 20 20

Pad name R1 R2 EREF EC ERC

Table 31 Test structures on the PRC1 module

34 electrical characterization 32

Pseudo-inverters

2x5 TSVs

TSV

Reference

Metal2 (TOP)Metal1 (TOP)

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)

the single TSV DUT the structure also contains 10 parallel TSVs

arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude

Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)

The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side

Structure rsquoCrsquo Variable pitch TSV capacitance

The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the

34 electrical characterization 33

2x5 TSVs

Reference

TSV

Pseudo-inverters

Metal2 (BTM)Metal1 (BTM)

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)

literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process

The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative

Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-

TOM die

The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking

34 electrical characterization 34

2x2 TSVs - 10μm

2x2 TSVs - 15μm

2x2 TSVs - 20μm

2x2 TSVs - 50μm

Metal2 (TOP)Metal1 (TOP)

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die

34 electrical characterization 35

Metal2 (TOP)Metal1 (TOP)

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Driver

Reference

Driver

Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)

Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy

Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability

Structure rsquoErsquo TSV capacitance effect on signal propagation delay

Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult

34 electrical characterization 36

ABC

D

ECBFC

ABC

Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)

ABC

DEF

CD

A

CA

A

Figure 316 Measurement setup

342 Measurement setup

The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316

The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-

2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup

34 electrical characterization 37

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

SET InputPulse generator

(Non-overlapping with RESET)

RESET InputPulse generator

(Non-overlapping with SET)

A[REF 1 10] InputSMU (Inverter source - 1V

supply)

B[REF 1 10] InputSMU (Inverter source - 1V

supply)

C[10 15 2050]

InputSMU (Inverter source - 1V

supply)

D[TOP BTM] InputSMU (Inverter source - 1V

supply)

R[1 2] InputSMU (Inverter source - 1V

supply)

E[REF C RC] Output Oscilloscope (high-impedance)

Table 32 PRC1 test pad instrument connections

sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32

343 Experimental results

There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods

When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy

34 electrical characterization 38

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 454

Mean capacitance 709 f F

Standard deviation (σ) 19 f F

Coefficient of variation (3σmicro) 80

Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo

The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability

Die to die (D2D) variability

The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)

On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid

On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691

34 electrical characterization 39

-60 -40 -20 00 20 40 60 80

σ-σ

706

Deviation from mean (fF)

0

005

01

015

02

025

Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 414

Mean capacitance 895 f F

Standard deviation (σ) 22 f F

Coefficient of variation (3σmicro) 74

Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo

of the experimental data within plusmnσ of the mean capacitancevalue

Correlation between TSV capacitance and oxide liner thickness vari-

ation

The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers

Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication

34 electrical characterization 40

σ-σ

691

Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70

0

002

004

006

008

01

012

014

016

018

02

Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function

075

080

085

090

095

100

105

110

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

075

080

085

090

095

100

105

Wafer area

Figure 319 TSV capacitance measurement results on wafer D11

34 electrical characterization 41

086088090092094096098100102104106108

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

086

088

090

092

094

096

098

100

102

104

106

108

Wafer area

Figure 320 TSV capacitance measurement results on wafer D18

tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo

Wafer to wafer (W2W) variability

Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters

34 electrical characterization 42

109

100

095Nor

mal

ized

oxi

de th

ickn

ess

-5-10 -15 -20 -25 -30

510

1520

2530

0Die Y

Die X

Figure 321 Simulated oxide liner thickness variation

60 65 70 75 80 85 90 95 1000

005

01

015

02

025D11 D18

plusmnσ

plusmnσ

Capacitance (fF)

Figure 322 TSV capacitance W2W variability - Probability density func-tions

34 electrical characterization 43

Structure rsquoCrsquo (Fig 311)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 424

Experiment C10 C50

Number of TSVs 4 4

Mean capacitance 4073 f F 4125 f F

Standard deviation (σ) 84 f F 74 f F

Population within σ 695 653

Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo

Variable pitch TSV capacitance

The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm

were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires

As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50

are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability

To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]

Measurement accuracy of CBCM

The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we

3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)

35 summary and conclusions 44

376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320

001

002

003

004

005

006C10 C50

Capacitance (fF)

plusmnσC10

plusmnσC50

695 653

Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function

-80 -60 -40 -20 00 20 40 60 80 1000

002

004

006

008

01

012

014

016

018σ-σ

688

Deviation from mean (fF)

Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function

observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs

connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance

35 summary and conclusions

In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement

35 summary and conclusions 45

Method LCR CBCM

Total dies 472 472

Processed dies (excluding outliers) 393 414

Mean capacitance 820 f F 895 f F

Standard deviation (σ) 23 f F 22 f F

Coefficient of variation (3σmicro) 86 74

Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo

results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]

The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration

The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV

parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM

In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process

4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E

41 introduction

The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV

filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance

Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models

In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]

42 tsv-induced thermomechanical stress

Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or

46

42 tsv-induced thermomechanical stress 47

Material CTE at 20 ˚C (10minus6K)

Si 3

Cu 17

W 45

Table 41 Coefficients of thermal expansion

Figure 41 TSV thermomechanical stress simulation [40]

polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate

Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction

It is well known that stress can affect carrier mobility in MOS-

FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance

Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type

43 test structure implementation 48

Figure 42 TSV thermomechanical stress interaction between two TSVs[40]

proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative

The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections

43 test structure implementation

TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible

43 test structure implementation 49

Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]

The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET

devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types

Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP

die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45

43 test structure implementation 50

A0

B0

C0

D0

E0

F0

G0

H0

I0

J0

A90

B90

E90

F90

G90

J90

DS3 DF3

DF2 DS0

DS2 DF0

DF1 DS1

SS SF

DEC6 VDD

DEC7 GND

DEC8 VG

DEC4 DEC9

DEC5 GND

DEC2 DEC3

DEC0 DEC1

Arraydecoder

Draindecoder

Figure 44 PTP1 test pad mod-ule (layout)

Figure 45 PTP1 test pad mod-ule (micrograph)

43 test structure implementation 51

Metal2Metal1PolyActive

TSV

MOSFETs

07μm007μm 0deg

07μm007μm 90deg

05μm05μm 90deg

05μm05μm 0deg

Figure 46 MOSFET dimensions and rotations

431 MOSFET arrays

Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)

The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs

lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44

The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)

Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution

43 test structure implementation 52

TSV

MOSFETs Substrate taps

Figure 47 1 TSV MOSFET array configuration

MOSFETs Substrate taps

TSV TSV

TSV TSV

Figure 48 4 diagonal TSVs MOSFET array configuration

43 test structure implementation 53

MOSFETs Substrate taps

TSV

TSV TSV

TSV

Figure 49 4 lateral TSVs MOSFET array configuration

MOSFETs Substrate taps

TSV TSV TSV

TSV TSV TSV

TSV TSV TSV

Figure 410 9 TSVs MOSFET array configuration

43 test structure implementation 54

Configuration RotationFET

width(microm)

FETlength(microm)

Name

0 0deg 07 007 A0

1 0deg 07 007 B0

4diagonal 0deg 07 007 C0

4lateral 0deg 07 007 D0

9 0deg 07 007 E0

0 0deg 05 05 F0

1 0deg 05 05 G0

4diagonal 0deg 05 05 H0

4lateral 0deg 05 05 I0

9 0deg 05 05 J0

0 90deg 07 007 A90

1 90deg 07 007 B90

9 90deg 07 007 E90

0 90deg 05 05 F90

1 90deg 05 05 G90

9 90deg 05 05 J90

Table 42 MOSFET array configurations for test pad modulesPTP1PTP2

43 test structure implementation 55

2μm 1μm

Metal2Metal1PolyActive

FET

DUMMY

TAP

TSV

Figure 411 MOSFET array detail

432 Digital selection logic

Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement

The procedure for selecting a single MOSFET device for meas-urement was as follows

bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled

bull MOSFET Gate terminal (G0 to G15) was connected to IO

pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates

bull MOSFET Drain terminal (D0 to D15) was connected to IO

pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder

43 test structure implementation 56

D0

D15

DF[0-3]

DS[0-3]

Drain-select decoder (2-bit)

SSF

SS

Array-select decoder (4-bit)

VG

Gate-select decoder (4-bit)

G0 G15

MOSFET Array

Figure 412 Digital selection logic

and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit

Drain-select decoder was adequate for accessing all 16Drain terminals in the array

When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)

A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating

43 test structure implementation 57

Gatedecoder

Metal2Metal1PolyActive

Transmissiongates (DS)

Transmission gates (G)

MOSFETArray

Figure 413 MOSFET array with digital selection logic

43 test structure implementation 58

BLK

A[0] A[1] A[2] A[3]

DEC[0-15]

Figure 414 4-bit Gate-enableArray-enable decoder

43 test structure implementation 59

DEC

VGGVDD

Transmission gate

Pull-downtransistor

Figure 415 Transmission gate for MOSFET Gate terminals

Transmission gates

FORCE

SENSE

DRAINSOURCE

EN

VDD

Figure 416 Transmission gates for MOSFET DrainSource terminals

43 test structure implementation 60

A[0] A[1]

DEC[0-3]

Figure 417 2-bit Drain-select decoder

44 experimental measurements 61

Configuration Type Name Wafer

1 No-TSVPMOS

(05microm times 05microm)F0 D08

2 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D08

3 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D20

4 1-TSV 0ordm 60 ordmCPMOS

(05microm times 05microm)G0 D08

54-lateral TSV 0ordm 25

ordmCPMOS

(05microm times 05microm)I0 D08

6 1-TSV 90ordm 25 ordmCPMOS

(05microm times 05microm)G90 D08

7 1-TSV 0ordm 25 ordmCPMOS (07microm times

007microm)B0 D08

Table 43 MOSFET array configurations selected for experimental eval-uation

44 experimental measurements

Test modules PTP1 and PTP2 implementing the presented MOS-

FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43

441 Data processing methodology

For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress

Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing

44 experimental measurements 62

ABCDEFBBC

FCDDFBCFFC

FCB

A

BAACADE

CECBBC

CFFCDE

B$BCFFBDBCBF

amp(BFC)B

BC

FCDDFA

CAF

Figure 418 Processing of measurement results

the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (

radic98) The described procedure for

processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not

placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)

442 Measurement setup

For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO

signal generation and measurement was provided by standard

44 experimental measurements 63

ABCDECDBF

ABECBF

AB

AB

Figure 419 Interpolation of measurement results

laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420

The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44

ABCCDE

F

AA

A

DBCB

Figure 420 Measurement setup

44 experimental measurements 64

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

DEC[0-9] InputDigital Output

(GATEDRAINARRAY SelectDecoders Input)

DF[0-3] Input SMU (Drain Force terminal)

DS[0-3] Output SMU (Drain Sense terminal)

SF Input SMU (Source Force terminal)

SS Output SMU (Source Sense terminal)

VG Input Digital Output (Gate voltage)

Table 44 PTP1PTP2 test pad module instrument connections

443 PMOS (long channel)

No-TSV configuration

This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance

As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure

1 TSV configuration

In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo

44 experimental measurements 65

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

096

097

098

099

100

101

Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)

and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements

To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm

The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph

44 experimental measurements 66

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

105

110

115

120TSV

Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)

44 experimental measurements 67

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

13

TSV

Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)

we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV

even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after

increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

09

1

11

12

13

14

Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)

44 experimental measurements 68

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115TSV

Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)

at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)

4 lateral TSV configuration

In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430

respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)

90˚ rotation TSV configuration

In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has

44 experimental measurements 69

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

085

090

095

100

105

110

115

TSV

TSV TSV

TSV

Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)

44 experimental measurements 70

2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309

095

1

105

11

115

12

X16

Distance (μm)

Normalized

Ion

TSV TSV

Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)

3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098

1102

Y14

Distance (μm)

Normalized

Ion

TSV TSV

Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)

44 experimental measurements 71

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115

120TSV

Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)

been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm

444 PMOS (short channel)

In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

08509

0951

10511

11512

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)

44 experimental measurements 72

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

094

096

098

100

102

104

106

108

110

TSV

Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)

measurements for X = 16 and Y = 14 in Figure 434 At at 1microm

distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)

45 summary and conclusions 73

Size

Waf

er

Con

figu

rati

on

Rot

atio

n

Eff

ect

1micro

mla

t

Eff

ect

1micro

mlo

ng

KO

Z

L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm

L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm

L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm

L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm

L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm

S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm

Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices

45 summary and conclusions

In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET

device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors

The structure was implemented in Imecrsquos experimental 65nm

3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations

In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits

Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short

45 summary and conclusions 74

channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress

Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure

1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex

2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy

3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away

4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic

The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated

In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs

5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S

51 introduction

TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2

DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs

increases linearly with the number of tiers and interconnections(Figure 51)

The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm

node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)

In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]

52 energy-recovery tsv interconnects

As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-

75

52 energy-recovery tsv interconnects 76

1XTSV

Tier1

TSV

Tier0LOGIC 0

LOGIC 0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n Tier1

Tier0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n

Tier(m-1)

Tier(m)

mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-

tion density

recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]

EER prop

(

RCL

T

)

CLV2DD (51)

In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]

ECMOS =12

CLV2DD (52)

Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits

EER

ECMOSprop

2RCL

T(53)

From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small

In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC

52 energy-recovery tsv interconnects 77

Tier2Tier1TSV

LOGIC 1

LOGIC 2

driver

TSV

LOGIC 1

LOGIC 2

CMOS

driver

Convert

Energy-recovery

PCG

Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery

can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV

interconnectsThe differences between a typical CMOS driver in a TSV-based

3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form

Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values

52 energy-recovery tsv interconnects 78

IN IN

ININ

OUT OUT

IN

IN

OUT

OUT

Figure 53 Adiabatic driver

Figure 54 A simple pulse-to-level converter (P2LC)

An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded

The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)

For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with

53 analysis 79

VDD2

Rind

RTG

AdiabaticDriver

CL

M1

T=1f

L ind

RTG

CL

Figure 55 Resonant pulse generator with dual-rail data encoding

Tier2

Tier1

f-fPULSE

VDD2CLK IN

Delay1CLK1

CLK0

Delay2CLK2OUT1

Data In

Data Out

CLKRES

TSV TSV TSV

CLKBufferAdiabatic

Driver

P2LC

f-f

Data1

M1

Figure 56 Energy-recovery scheme for TSV interconnects

a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at

f =1

2πradic

LindCL(54)

A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56

53 analysis

The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy

53 analysis 80

losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme

The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be

EER = EAD + Eind + EM1 + EP2L (55)

Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs

architecture used and could be easily extracted from simulationdata as will be shown later in this chapter

531 Adiabatic driver

There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS

dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances

The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle

ETG = 2ξ

(

RTGCL12 T

)

CLV2DD =

π2

2(RTGCL f )CLV2

DD (56)

The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient

53 analysis 81

Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV

capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be

CL = CTSV + Cn + 6CD (57)

The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience

κTG = RTGCn rArr RTG =κTG

Cn(58)

Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2

DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver

EAD = CnV2DD +

π2

2κTG

Cnf [CTSV + Cn + 6CD]

2 V2DD (59)

The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2

2 κTG f to a variable (y = π2

2 κTG f ) Equation 59then becomes

EAD = D middot CnV2DD +

y

Cn[CTSV + (6b + 1)Cn]

2 V2DD

=

[

Cn

(

D

y+ (6b + 1)2

)

+1

CnC2

TSV

]

middot yV2DD

+(12b + 2)CTSV middot yV2DD (510)

Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when

53 analysis 82

they become equal Solving equation Cn

(

Dy + (6b + 1)2

)

= 1Cn

C2TSV

for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV

Cn(opt) =

radic

[D

y+ (6b + 1)2]minus1CTSV (511)

Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver

EAD(min) =

[radic

D

(6b + 1)2y+ 1 + 1

]

middot (12b+ 2)yCTSVV2DD (512)

532 Inductorrsquos parasitic resistance

The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as

Qind =1

Rind

radic

Lind

CLrArr Rind =

1QindCL2π f

(513)

Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513

Eind =π2

2(RindCL f )CLV2

DD =π

4CL

QindV2

DD (514)

533 Switch M1

Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1

Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as

EM1(min) = 2IM1(rms)VGM1

radic

mκM1

f(515)

54 simulations 83

Delay2

CLK1

CLK2Delay1

CLKW

PULSE

CLK0T

PD

OUT1

PULSE

P2L

RES

CLK

Figure 57 Energy-recovery timing constraints

534 Timing Constraints

The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57

The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)

54 simulations

To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the

54 simulations 84

M1IN IN

IN

OUT OUT

IN

Figure 58 Test case for the evaluation of the theoretical model

Q250MHz 500MHz 750MHz

T S e () T S e () T S e ()

6 360 409 -120 476 516 -78 578 613 -57

8 308 347 -114 417 449 -70 515 543 -52

10 276 312 -115 382 411 -70 477 498 -43

12 255 289 -119 358 386 -71 451 472 -44

Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)

adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58

The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51

The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations

In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency

54 simulations 85

5 6 7 8 9 10 11 12 130

10

20

30

40

50

60

70

T250 S250 T500 S500 T750 S750

Q factor

Energydissipation(fF)

Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)

5 6 7 8 9 10 11 12 13

-14

-12

-10

-8

-6

-4

-2

0

250MHz 500MHz 750MHz

Q factor

Error()

Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator

The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation

54 simulations 86

M1

TSV

TSV

TSV

Data in OutCMOS

OutER

Energy-recovery

CMOS

P2LC

Adiabaticdriver

Figure 511 Test case for comparison of energy-recovery to CMOS

541 Comparison to CMOS

When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F

for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is

described by Equation 52 Appending the data switching activity(D)

ECMOS = D middot 12

CLV2DD (516)

Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS

circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison

Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range

54 simulations 87

Figure 512 Improved P2LC design

0 100 200 300 400 500 600 700 800 9000

2

4

6

8

10

12

14

Frequency (MHz)

Energy

dissipation(fJ)

Figure 513 Simulated energy dissipation of improved P2LC design

Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current

The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor

54 simulations 88

4 6 8 10 12 14 16 18 20 22-20

-10

0

10

20

30

40

50

60

800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor

Energyred uction()

Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)

4 6 8 10 12 14 16 18 20 220050

100150200250300350400450

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)

values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors

In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies

In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q

factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total

In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS

circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in

54 simulations 89

4 6 8 10 12 14 16 18 20 2200

100

200

300

400

500

600

700

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)

20 40 60 80 100 120 140 160 180 200 220-100

00

100

200

300

400

500

Q6 Q10 Q14 Q18

TSV capacitance (fF)

Energy

red u

ction(

)

Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)

Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value

Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz

Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm

node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS

54 simulations 90

4 6 8 10 12 14 16 18 20 22-40

-20

0

20

40

60

80

D=05 D=07 D=09 D=1

Q factor

Energyred uction()

Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)

4 6 8 10 12 14 16 18 20 220

10

20

30

40

50

60

130nm 65nm

Q factor

Energyred uction()

Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)

55 summary and conclusions 91

55 summary and conclusions

The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process

The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement

Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme

The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process

6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R

61 introduction

In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance

Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements

62 physical implementation

The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5

The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61

The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz

were calculated using the methodology developed in Chapter 5

and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery

circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)

92

62 physical implementation 93

AB

AB

CDEFD

C

ABCD

EFD

C

Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)

Adiabatic driver 80-bit circuit

Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )

RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)

CL 207 f F (Eq57) Lind 613nH (Eq54)

Table 61 Energy-recovery IO circuit parameters

62 physical implementation 94

IntegratedInductor

VINDOER

PulseGenerator

S1AS2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

Figure 62 Energy-recovery IO circuit schematic diagram

Metal2Metal1PolyActiveVDD

GND

RES_CLK

DATA

OUT

OUT_B

DATA

OUT_B OUT

RES_CLK

GND

VDD

Figure 63 Adiabatic driver (layoutschematic view)

the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs

621 Adiabatic driver

The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63

62 physical implementation 95

Metal2Metal1PolyActive

IN

OUT_B

VDD

IN_B

OUT

GND

OUT_B

IN

GND

VDD

OUT

IN_B

Figure 64 P2LC (layoutschematic view)

622 Pulse-to-level converter

The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals

623 Data generator

A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1

As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE

624 Pulse generator

The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the

62 physical implementation 96

Metal2Metal1PolyActive EN

CLK

VDD

Q

GND

D

CLKGND

ENVDD

QF-F

D

GND

VDDCLK

D Q

Figure 65 Data generator (layoutschematic view)

Delay1

WPULSE

PULSE

CLKTCLK

DATA

CLK_RES

Figure 66 Timing constraints for the energy-recovery demonstratorcircuit

62 physical implementation 97

PULSES1A

S2

GND

VDD

Metal2Metal1PolyActive

VDD

S1

S2

GND

PULSE

Figure 67 Pulse generator (layoutschematic view)

S2ΔPhase

WPULSE

PULSE

S1A

Figure 68 Pulse generator InputOutput signals

gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control

For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68

625 Integrated inductor

The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-

1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)

62 physical implementation 98

100 200 300 400 500 600 700 8005

7

9

11

13

15

17

19

21

10

15

20

25

30

35

40

45

Q (Simulated) Energy Reduction ()

Frequency (MHz)

Qfactor

Energyreduction()

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS

lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)

As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited

For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610

626 Top level design

The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)

For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously

62 physical implementation 99

500μ

m

500μm

Figure 610 Planar square spiral inductor designed in ASITIC

shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance

The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG

80 = 16880 = 21Ω However as

the current flows to lower branch levels in the tree the perceivedresistance increases to RTG

16 RTG4 and on the last one to RTG Any

additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512

Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20

4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit

To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of

62 physical implementation 100

Adiabatic drivers

RTG

16

RTG

80

RTG

4

RTG

Level 1Level 2Level 3Level 4

Figure 611 Resonant pulse distribution tree

Branch level Wire resistance

1st 5 middot 16880 = 105mΩ

2nd 5 middot 16816 = 525mΩ

3rd 5 middot 1684 = 21Ω

4th 5 middot 168Ω = 84Ω

Total load 168+8480 + 21

20 + 05255 + 0105 = 252Ω

Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree

63 post-layout simulation and comparison 101

+-

+-Metal2 Metal1

180μ

m

100μm

Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)

the energy-recovery IO circuit in layout view is illustrated inFigure 613

627 CMOS IO circuit

The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance

The top level layout view of the CMOS IO circuit can be seenin Figure 616

63 post-layout simulation and comparison

The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed

63 post-layout simulation and comparison 102

22mm

12mm

GND GNDVBUF1 VDR

GND

VIND

S1A

S2

EN

OER

Figure 613 Energy-recovery IO circuit in layout view

63 post-layout simulation and comparison 103

OCMOS

S1B DataGenerator

TSV

CMOSDriver

CMOSDriver

TSV Buffer

1

80

EN

Figure 614 CMOS IO circuit schematic diagram

GND

VCMOS

DATA

078u

039u

105u

053u

316u

158u

950u

475u

OUT

Figure 615 CMOS driver schematic

12mm

12mm

OCMOS S1B GND VBUF2 VCMOSGND

GND GNDEN VCMOS

Figure 616 CMOS IO circuit in layout view

63 post-layout simulation and comparison 104

Energy dissipation (80-bits cycle)

Inductor 11pJ (Eq514)

Switch M1 092pJ [ ]

P2LCs 098pJ (Simulated)

Adiabatic drivers 315pJ (Eq512)

Total 615pJ

Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz

After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency

Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme

As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the

63 post-layout simulation and comparison 105

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14RES_CLK

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14DATA

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14OER

Time (ns)

V(V

)

Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit

64 experimental measurements 106

AreaEnergy

dissipation(Theory)

Energydissipation

(Post-layout)

CMOS 144mm2922pJ

(115 f Jbit)

(Eq52)

1265pJ

(158 f Jbit)

Energy recovery 264mm2 615pJ

(77 f Jbit)

768pJ

(96 f Jbit)

Difference +83 minus333 minus393

Table 64 Post-layout simulated energy dissipation Area require-ments

theoretically calculated 333 up to 393 in the post-layoutsimulations

64 experimental measurements

The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively

641 Measurement setup

For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620

The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66

642 Results

CMOS IO circuit

In Figure 621 the input signal S1B and the output signal OCMOS

are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency

64 experimental measurements 107

Figure 618 Energy-recovery IO circuit micrograph

Figure 619 CMOS IO circuit micrograph

AABC

DEFF

D

EEBA

EBBA

Figure 620 Measurement setup

64 experimental measurements 108

Test pad Direction Instrument connection

VIND Power Power supply - gt06V

GND Power Power supply - 0V

S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)

EN InputDigital Output (enable outputdata switching)

OER Output Oscilloscope

VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V

VBUF1 Power Power supply (Buffers) - 12V

Table 65 Energy-recovery circuit instrument connections

Test pad Direction Instrument connection

VCMOS PowerPower supply (CMOS drivers) -12V

GND Power Power supply - 0V

VBUF2 Power Power supply (Buffers) - 12V

S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

OCMOS Output Oscilloscope

EN InputDigital Output (enable outputdata switching)

Table 66 CMOS circuit instrument connections

64 experimental measurements 109

S1B

OCMOS

Figure 621 Oscilloscope output for signals S1B and OCMOS

100 200 300 400 500 600 700 800012345678910

Die1 Die2 Simulation

Frequency (MHz)

Powerdissipation(mW)

Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit

that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected

The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations

Energy-recovery IO circuit

Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER

64 experimental measurements 110

100 200 300 400 500 600 700 800024681012141618

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 623 Measuredsimulated energy dissipation for the CMOScircuit

S1A

OER

Figure 624 Oscilloscope output for signals S1A and OER

shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected

The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)

Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate

65 summary and conclusions 111

200 250 300 350 400 450 500 550 600 65002468101214161820

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit

IntegratedInductor

VINDOER

PulseGenerator

S1A

S2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

200pF

Figure 626 Energy-recovery IO circuit with fault hypothesis

properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER

switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation

65 summary and conclusions

In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them

The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-

65 summary and conclusions 112

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050

055

060

065

070

075

080RES_CLK

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500

02

04

06

08

10

12

14OER

Time (ns)

V(V

)

Figure 627 Simulation of circuit with fault hypothesis

65 summary and conclusions 113

200 250 300 350 400 450 500 550 600 6500

5

10

15

20

25

Die1 Die2 Simulation CL=200pF

Frequency (MHz)

Ene

rgy

diss

ipat

ion

(pJ)

Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis

ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS

circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated

CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

7C O N C L U S I O N S

Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density

As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV

devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs

Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-

FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process

Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process

The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work

114

71 main contributions 115

71 main contributions

From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data

The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations

In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design

72 future work 116

approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances

The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

72 future work

The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV

capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]

In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes

1 The transistors should be in a regular grid in the array forsimple processing of the measurement data

72 future work 117

2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy

3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV

4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues

Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility

In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory

AA P P E N D I X A P H O T O S

Figure A1 3D65 300mm wafer (on chuck)

Figure A2 Probe card

118

appendix a photos 119

Figure A3 3D130 200mm wafer (on chuck)

ProbeHead

Microscope

Figure A4 Probe head

BA P P E N D I X B S TAT I S T I C S

mean (micro)

In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as

micro =1n

n

sumi=1

xi

median (m)

The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean

standard deviation (σ)

Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation

A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability

standard deviation of the mean (σmicro )

The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated

Assuming there is a data set X with n values xi of mean micro

120

appendix b statistics 121

variance(X) = σ2X

variance(micro) = variance( 1n sum

ni=1 xi) =

1n2 variance(sumn

i=1 xi) =1n2 sum

ni=1 variance(xi) =

nn2 variance(X) =1n variance(X) =

σ2Xn rarr

σmicro = σXradicn

coefficient of variation

The coefficient of variation is the ratio of the standard deviationσ to the mean micro

CV =σ

micro

It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage

outlier

Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set

kruskalndash wallis [34]

KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal

B I B L I O G R A P H Y

[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac

currentsourcebroadpurposemn=2602

[2] Max-3d URL httpwwwmicromagiccom

[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet

[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet

[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001

[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI

Design Conf pages 171ndash174 2005 doi 101109ICVD200564

[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf

3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581

[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009

[9] C H Bennett Logical reversibility of computation IBM

Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525

[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038

scientificamerican0785-48

[11] E Beyne 3d system integration technologies In Proc Int

VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113

122

bibliography 123

[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs

22nm-Details_Presentationpdf

[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327

[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534

[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices

Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124

[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942

[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc

IEEE Int Interconnect Technology Conf and 2011 Materials for

Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326

[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf

(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823

[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS

rsquo04 volume 2 2004 doi 101109ISCAS20041329255

[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE

Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260

[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-

bibliography 124

asitics for 3-dimensional integrated circuits In Workshop

Notes Design Automation and Test in Europe (DATE) 2009

[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT

2008 pages 98ndash103 2008 doi 101109IDT20084802475

[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec

beScientificReportSR2009HTML1213307html

[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905

[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE

Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329

[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int

Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311

[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508

[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712

[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii

B9780750677295500446

[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455

bibliography 125

[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591

[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test

Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889

[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on

Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10

1145224081224115

[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical

Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779

[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom

pressroompdfkkuhnKuhn_IWCE_invited_textpdf

[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5

(3)183ndash191 1961 doi 101147rd530183

[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827

[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190

[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf

(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839

[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-

nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079

bibliography 126

[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System

level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm

org101145966747966750

[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453

[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf

PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725

[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE

Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793

[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629

[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp

Exhibition (DATE) pages 1689ndash1694 2010

[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573

[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190

[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi

bibliography 127

A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices

Meeting (IEDM) 2010 doi 101109IEDM20105703278

[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727

[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10

1109JPROC1998658762

[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test

Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103

[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc

56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676

[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443

[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88

(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom

sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on

[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303

bibliography 128

[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium

on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145

10576611057669

[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872

[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004

[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-

sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888

[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156

[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc

IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522

[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501

[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667

[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477

[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-

tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609

bibliography 129

[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology

Conf 2006 doi 101109ECTC20061645678

[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power

Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220

[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc

Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786

[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-

Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338

[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088

[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044

[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int

Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763

[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-

tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http

dxdoiorg101007BF01788562 101007BF01788562

bibliography 130

[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-

electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww

sciencedirectcomsciencearticleB6V44-4829XDJ-2N

2f5b2c221dcf240c38cd13b7b3432f913

[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182

[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541

[78] NHE Weste and DF Harris CMOS VLSI design a cir-

cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775

[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite

printsteve_wozniak_interview

[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-

tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716

[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect

Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710

[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746

[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764

[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915

bibliography 131

[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610

  • Abstract
  • Publications
  • Acknowledgments
  • Contents
  • List of Figures
  • List of Tables
  • Acronyms
  • 1 Introduction
    • 11 Research goals and contribution
    • 12 Thesis outline
      • 2 Background
        • 21 3D integrated circuits
          • 211 3D integration approaches
          • 212 TSV interconnects
            • 22 Energy-recovery logic
              • 221 Energy-recovery architectures
              • 222 Power-clock generators
                • 23 Full-custom design flow
                • 24 Summary
                  • 3 Characterization of TSV capacitance
                    • 31 Introduction
                    • 32 TSV parasitic capacitance
                    • 33 Integrated capacitance measurement techniques
                    • 34 Electrical characterization
                      • 341 Physical implementation
                      • 342 Measurement setup
                      • 343 Experimental results
                        • 35 Summary and conclusions
                          • 4 TSV proximity impact on MOSFET performance
                            • 41 Introduction
                            • 42 TSV-induced thermomechanical stress
                            • 43 Test structure implementation
                              • 431 MOSFET arrays
                              • 432 Digital selection logic
                                • 44 Experimental measurements
                                  • 441 Data processing methodology
                                  • 442 Measurement setup
                                  • 443 PMOS (long channel)
                                  • 444 PMOS (short channel)
                                    • 45 Summary and conclusions
                                      • 5 Evaluation of energy-recovery for TSV interconnects
                                        • 51 Introduction
                                        • 52 Energy-recovery TSV interconnects
                                        • 53 Analysis
                                          • 531 Adiabatic driver
                                          • 532 Inductors parasitic resistance
                                          • 533 Switch M1
                                          • 534 Timing Constraints
                                            • 54 Simulations
                                              • 541 Comparison to CMOS
                                                • 55 Summary and conclusions
                                                  • 6 Energy-recovery 3D-IC demonstrator
                                                    • 61 Introduction
                                                    • 62 Physical implementation
                                                      • 621 Adiabatic driver
                                                      • 622 Pulse-to-level converter
                                                      • 623 Data generator
                                                      • 624 Pulse generator
                                                      • 625 Integrated inductor
                                                      • 626 Top level design
                                                      • 627 CMOS IO circuit
                                                        • 63 Post-layout simulation and comparison
                                                        • 64 Experimental measurements
                                                          • 641 Measurement setup
                                                          • 642 Results
                                                            • 65 Summary and conclusions
                                                              • 7 Conclusions
                                                                • 71 Main contributions
                                                                • 72 Future work
                                                                  • A Appendix A Photos
                                                                  • B Appendix B Statistics
                                                                  • Bibliography
Page 7: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate

In the end I hope therersquos a little note somewhere

that says I designed a good computer

mdash Steve Wozniak [79]

A C K N O W L E D G E M E N T S

First and foremost I would like to express my gratitude to mysupervisor Prof Alex Yakovlev who provided the perfect balancebetween freedom and guidance allowing me to obtain sphericalknowledge and scope in microelectronics but at the same time topromptly complete my studies Irsquom also grateful to the Engineer-ing and Physical Science Research Council (EPSRC) for fundingmy PhD studies through their Doctoral Training Accounts

As part of my research project I had the opportunity to spendsome time at Imec Belgium The experiences and lessons learnedduring my stay at Imec will undoubtedly prove valuable in myforthcoming professional career I would like to thank Dan Perryfrom Qualcomm for sharing his experience and ideas for the de-sign of the characterization structures implemented on the 65nm

Imec process Special thanks to Shinichi Domae from Panasonicand Dr Dimitrios Velenis for their assistance with the 65nm wafermeasurements as well as Marco Facchini for the time he spendtrying to measure the energy-recovery circuits Last but not leastIrsquod like to thank Geert Van der Plas and Paul Marchal for theiradvices and guidance throughout my stay at Imec

Finally many thanks to my colleagues at Newcastle Universityfor assisting me in diverse ways during my studies I would liketo especially mention Dr Santosh Shedabale and James Dochertyfor reviewing and commenting on parts of this text as well asDr Robin Emery for the numerous technical discussions and forhelping setup the design tools

v

C O N T E N T S

1 Introduction 1

11 Research goals and contribution 2

12 Thesis outline 3

2 Background 5

21 3D integrated circuits 5

211 3D integration approaches 6

212 TSV interconnects 9

22 Energy-recovery logic 13

221 Energy-recovery architectures 17

222 Power-clock generators 18

23 Full-custom design flow 20

24 Summary 21

3 Characterization of TSV capacitance 23

31 Introduction 23

32 TSV parasitic capacitance 23

33 Integrated capacitance measurement techniques 24

34 Electrical characterization 27

341 Physical implementation 28

342 Measurement setup 36

343 Experimental results 37

35 Summary and conclusions 44

4 TSV proximity impact on MOSFET performance 46

41 Introduction 46

42 TSV-induced thermomechanical stress 46

43 Test structure implementation 48

431 MOSFET arrays 51

432 Digital selection logic 55

44 Experimental measurements 61

441 Data processing methodology 61

442 Measurement setup 62

443 PMOS (long channel) 64

444 PMOS (short channel) 71

45 Summary and conclusions 73

5 Evaluation of energy-recovery for TSV interconnects 75

51 Introduction 75

52 Energy-recovery TSV interconnects 75

53 Analysis 79

531 Adiabatic driver 80

532 Inductorrsquos parasitic resistance 82

533 Switch M1 82

534 Timing Constraints 83

54 Simulations 83

541 Comparison to CMOS 86

vi

contents vii

55 Summary and conclusions 91

6 Energy-recovery 3D-IC demonstrator 92

61 Introduction 92

62 Physical implementation 92

621 Adiabatic driver 94

622 Pulse-to-level converter 95

623 Data generator 95

624 Pulse generator 95

625 Integrated inductor 97

626 Top level design 98

627 CMOS IO circuit 101

63 Post-layout simulation and comparison 101

64 Experimental measurements 106

641 Measurement setup 106

642 Results 106

65 Summary and conclusions 111

7 Conclusions 114

71 Main contributions 115

72 Future work 116

a Appendix A Photos 118

b Appendix B Statistics 120

bibliography 122

L I S T O F F I G U R E S

Figure 11 Scaling trend of logic and interconnect delay[30] 2

Figure 21 Comparison of 2D3D Integration 5

Figure 22 System-in-a-package 6

Figure 23 Package-on-package 6

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7

Figure 25 3D wafer-level-packaging 7

Figure 26 3D-stacked IC 8

Figure 27 TSV process steps 9

Figure 28 Before thinningafter thinning 10

Figure 29 TSV stacking 10

Figure 210 Conventional AND gate 14

Figure 211 CMOS switching 16

Figure 212 Adiabatic switching 16

Figure 213 Basic 2N-2P adiabatic circuit 17

Figure 214 2N-2P timing 17

Figure 215 Adiabatic driver 18

Figure 216 Stepwise charging generator 19

Figure 217 1N power clock generator 20

Figure 218 Design flow 22

Figure 219 Inverter in Virtuoso schematic 22

Figure 220 Inverter in Virtuoso layout 22

Figure 31 Isolated 2D wire capacitance 24

Figure 32 TSV parasitic capacitance 25

Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25

Figure 34 CBCM pseudo-inverters 27

Figure 35 Current as a function of frequency in CBCM 28

Figure 36 CBCM pseudo-inverter pair in layout 29

Figure 37 PRC1 test pad module (layout) 30

Figure 38 PRC1 test pad module (micrograph) 30

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34

Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35

viii

List of Figures ix

Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35

Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36

Figure 316 Measurement setup 36

Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39

Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40

Figure 319 TSV capacitance measurement results onwafer D11 40

Figure 320 TSV capacitance measurement results onwafer D18 41

Figure 321 Simulated oxide liner thickness variation 42

Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42

Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44

Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44

Figure 41 TSV thermomechanical stress simulation [40] 47

Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48

Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49

Figure 44 PTP1 test pad module (layout) 50

Figure 45 PTP1 test pad module (micrograph) 50

Figure 46 MOSFET dimensions and rotations 51

Figure 47 1 TSV MOSFET array configuration 52

Figure 48 4 diagonal TSVs MOSFET array configuration52

Figure 49 4 lateral TSVs MOSFET array configuration 53

Figure 410 9 TSVs MOSFET array configuration 53

Figure 411 MOSFET array detail 55

Figure 412 Digital selection logic 56

Figure 413 MOSFET array with digital selection logic 57

Figure 414 4-bit Gate-enableArray-enable decoder 58

Figure 415 Transmission gate for MOSFET Gate terminals59

Figure 416 Transmission gates for MOSFET Drain-Source terminals 59

Figure 417 2-bit Drain-select decoder 60

Figure 418 Processing of measurement results 62

Figure 419 Interpolation of measurement results 63

Figure 420 Measurement setup 63

Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65

List of Figures x

Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66

Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66

Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67

Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67

Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68

Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69

Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69

Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70

Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70

Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71

Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71

Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72

Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72

Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76

Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77

Figure 53 Adiabatic driver 78

Figure 54 A simple pulse-to-level converter (P2LC) 78

Figure 55 Resonant pulse generator with dual-rail dataencoding 79

Figure 56 Energy-recovery scheme for TSV interconnects79

Figure 57 Energy-recovery timing constraints 83

Figure 58 Test case for the evaluation of the theoret-ical model 84

Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85

Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85

Figure 511 Test case for comparison of energy-recoveryto CMOS 86

Figure 512 Improved P2LC design 87

List of Figures xi

Figure 513 Simulated energy dissipation of improvedP2LC design 87

Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88

Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88

Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89

Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =

80 f F f = 500MHz) 89

Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90

Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90

Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93

Figure 62 Energy-recovery IO circuit schematic dia-gram 94

Figure 63 Adiabatic driver (layoutschematic view) 94

Figure 64 P2LC (layoutschematic view) 95

Figure 65 Data generator (layoutschematic view) 96

Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96

Figure 67 Pulse generator (layoutschematic view) 97

Figure 68 Pulse generator InputOutput signals 97

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98

Figure 610 Planar square spiral inductor designed inASITIC 99

Figure 611 Resonant pulse distribution tree 100

Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101

Figure 613 Energy-recovery IO circuit in layout view 102

Figure 614 CMOS IO circuit schematic diagram 103

Figure 615 CMOS driver schematic 103

Figure 616 CMOS IO circuit in layout view 103

Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105

Figure 618 Energy-recovery IO circuit micrograph 107

List of Figures xii

Figure 619 CMOS IO circuit micrograph 107

Figure 620 Measurement setup 107

Figure 621 Oscilloscope output for signals S1B and OC-MOS 109

Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109

Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110

Figure 624 Oscilloscope output for signals S1A and OER110

Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111

Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111

Figure 627 Simulation of circuit with fault hypothesis 112

Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113

Figure A1 3D65 300mm wafer (on chuck) 118

Figure A2 Probe card 118

Figure A3 3D130 200mm wafer (on chuck) 119

Figure A4 Probe head 119

L I S T O F TA B L E S

Table 21 3D interconnect technologies comparison 8

Table 22 Technology parameters for Imec 3D130 process11

Table 23 Technology parameters for Imec 3D65 process11

Table 24 Comparison energy-recoveryconventionalCMOS 16

Table 31 Test structures on the PRC1 module 31

Table 32 PRC1 test pad instrument connections 37

Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38

Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39

Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43

Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45

Table 41 Coefficients of thermal expansion 47

Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54

Table 43 MOSFET array configurations selected forexperimental evaluation 61

Table 44 PTP1PTP2 test pad module instrumentconnections 64

Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73

Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84

Table 61 Energy-recovery IO circuit parameters 93

Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100

Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104

Table 64 Post-layout simulated energy dissipation Area requirements 106

Table 65 Energy-recovery circuit instrument connec-tions 108

Table 66 CMOS circuit instrument connections 108

xiii

A C R O N Y M S

2D two-dimentional

2N-2P 2 nMOS2 pMOS adiabatic logic

3D-IC three-dimentional integrated circuit

3D-SIC 3D stacked-IC

3D-SIP 3D system-in-package

3D three-dimentional

3D-WLP 3D wafer-level-packaging

BCB benzocyclobutene

BEOL back-end-of-line

CAL clocked adiabatic logic

CBCM charge-based capacitance measurement

CMOS complementary metal-oxide-semiconductor

CMP chemical-mechanical planarization

CTE coefficient of thermal expansion

Cu copper

CVD chemical vapor deposition

D2D die-to-die

D2W die-to-wafer

DCVSL differential cascode voltage switch logic

DRC design rule check

DUT device under test

ECRL efficient charge recovery logic

EDA electronic design automation

FEM finite element method

FEOL front-end-of-line

GPIB general purpose interface bus

IC integrated circuit

xiv

acronyms xv

IO inputoutput

KOZ keep-out-zone

LC inductance-capacitance

LVS layout versus schematic

MEMS microelectromechanical systems

MIM metal-insulator-metal

MOSFET metal-oxide-semiconductor field-effect transistor

MOS metal-oxide-semiconductor

MPU microprocessor unit

MuGFET multiple gate field-effect transistor

NMOS n-channel MOSFET

P2LC pulse-to-level converter

PAL pass-transistor adiabatic logic

PCB printed circuit board

PCG power-clock generator

PMD pre-metal dielectric

PMOS p-channel MOSFET

PVT process voltage temperature

QSERL quasi-static energy recovery logic

RC resistance-capacitance

RDL re-distribution layer

RLC resistance-inductance-capacitance

RMS root mean square

SIP system-in-a-package

Si silicon

SMU source measurement unit

SoC system-on-chip

TaN tantalum nitride

TSV through-silicon via

W tungsten

acronyms xvi

W2W wafer-to-wafer

WLP wafer-level-packaging

1I N T R O D U C T I O N

Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965

chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]

As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects

In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)

The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and

1 International Technology Roadmap for Semiconductors

1

11 research goals and contribution 2

Figure 11 Scaling trend of logic and interconnect delay [30]

innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]

Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D

interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density

11 research goals and contribution

As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-

ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]

The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future

12 thesis outline 3

3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection

The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution

The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs

are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process

The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements

12 thesis outline

This thesis is organised into seven Chapters and two Appendixesas follows

2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)

12 thesis outline 4

Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described

Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods

Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process

Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design

Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements

Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work

Appendix A includes photos used as reference in various partsof this work

Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data

2B A C K G R O U N D

The work presented in this thesis contributes in the fields of 3D

integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters

21 3d integrated circuits

CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package

There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy

Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of

ABCDEFBA ABCDEFBA

Figure 21 Comparison of 2D3D Integration

5

21 3d integrated circuits 6

ABCDEFBC

E

Figure 22 System-in-a-package

ABCDEFBC

E

Figure 23 Package-on-package

a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities

The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor

211 3D integration approaches

3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure

bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages

bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)

bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion

The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)

3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps

21 3d integrated circuits 7

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]

A

BCDEAFE

FFDC

CFE

Figure 25 3D wafer-level-packaging

take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]

3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]

The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-

SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost

A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21

21 3d integrated circuits 8

ABCD

EFFDB

F

Figure 26 3D-stacked IC

3D interconnect

technologyAdvantages Disadvantages

3D system-in-package (3D-SIP)

Simple existingtechnology bestheterogeneous

integration

Performance(speed power) low

densitymanufacturing costin high-quantities

3D wafer-level-packaging (3D-

WLP)

Existingtechnology good

compromisebetween

performance-manufacturing

cost

Averageperformance and

density

3D stacked-IC (3D-SIC)

Performance(speed power)

high densitymanufacturing costin high-quantities

Manufacturing costin low-quantities

not ready formanufacturing

Table 21 3D interconnect technologies comparison

21 3d integrated circuits 9

ABBCDE

FBACACDA

ACCD F

Figure 27 TSV process steps

212 TSV interconnects

Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection

In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie

Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively

As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]

21 3d integrated circuits 10

AAAB

CAAAB

ABC

DEF

Figure 28 Before thinningafter thinning

ABC ABC

DE DE

F

DEE

B

CBA

Figure 29 TSV stacking

21 3d integrated circuits 11

Minimum devicechannel length (nm)

130

Supply voltage (V) 12

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 22

TSV resistance (mΩ) ~20

TSV capacitance( f F)

~40

Wafer diameter(mm)

200

Table 22 Technology paramet-ers for Imec 3D130

process

Minimum devicechannel length (nm)

70

Supply voltage (V) 10

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 40

TSV resistance (mΩ) ~40

TSV capacitance( f F)

~80

Wafer diameter(mm)

300

Table 23 Technology paramet-ers for Imec 3D65 pro-cess

Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing

Design tools

Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs

Thermomechanical stress

TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur

21 3d integrated circuits 12

The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance

Thermal analysis

The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink

Electrical characteristics

The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]

Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV

structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process

22 energy-recovery logic 13

Testing

Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems

22 energy-recovery logic

TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology

1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections

2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them

Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that

22 energy-recovery logic 14

Figure 210 Conventional AND gate

were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6

Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded

Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output

Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to

22 energy-recovery logic 15

perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]

In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]

The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =

CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]

E = QVDD = CV2DD (21)

From the total transferred energy only half ( 12 CV2

DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C

The adiabatic charging of an equivalent node capacitance C

is modelled in Figure 212 In contrast to the conventional CMOS

circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to

EAdiabatic prop

(

RC

T

)

CV2DD (22)

The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters

22 energy-recovery logic 16

A

B

CC

D

E

D

Figure 211 CMOS switching

AA

B

C

B

Figure 212 Adiabatic switching

Conventional CMOS Energy-recovery CMOS

2 states True False 3 states True False Off

Energy of information bitdissipated as heat on the

pull-down network

Energy of information bitrecovered by the

power-supply

High speed Lowmedium speed

High energy dissipation Very low energy dissipation

Area efficient Some area overhead

Table 24 Comparison energy-recoveryconventional CMOS

Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]

A comparison between energy-recovery and conventional CMOSis summarised in Table 24

22 energy-recovery logic 17

ABCABC

Figure 213 Basic 2N-2P adiabaticcircuit

ABC

ABC

DEF

EE

DEF

Figure 214 2N-2P timing

221 Energy-recovery architectures

Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency

A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P

circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern

The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel

22 energy-recovery logic 18

Figure 215 Adiabatic driver

MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network

The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]

The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle

Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]

222 Power-clock generators

Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and

22 energy-recovery logic 19

A

A

B

C

C

Figure 216 Stepwise charging generator

recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators

Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow

Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at

f =1

2πradic

LCLOAD

(23)

As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is

23 full-custom design flow 20

A

BCD

E

DFF

CD

Figure 217 1N power clock generator

partially recovered and stored in the inductor for later use RLC

oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip

23 full-custom design flow

All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal

In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available

The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case

1 Cadence Design Systems Inc (httpwwwcadencecom)

24 summary 21

the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step

Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics

24 summary

In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process

2 Mentor Graphics Inc (httpwwwmentorcom)

24 summary 22

ABCDEA

FCCAAE

ABCDEAEC

DE

E

F

CEA

EEAEA

EDE

CAC

E

EC

CAC

Figure 218 Design flow

Figure 219 Inverter in Virtuososchematic

Figure 220 Inverter in Virtuosolayout

3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E

31 introduction

In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology

In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]

From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system

Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated

32 tsv parasitic capacitance

A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric

1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments

23

33 integrated capacitance measurement techniques 24

ABC

Figure 31 Isolated 2D wire capacitance

material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is

C =εox

hwl (31)

In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material

In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)

The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)

33 integrated capacitance measurement techniques

Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV

33 integrated capacitance measurement techniques 25

ABCDE

F

FD

F

F

FD

Figure 32 TSV parasitic capacitance

ABCDE FACDE

ECDE

BCFACDE

FCCDED DBCDE

D

DAB

Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor

33 integrated capacitance measurement techniques 26

characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories

1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]

2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]

The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer

On-chip techniques can achieve better accuracy when the DUT

capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]

In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows

CDUT =Im minus Ire f

VDD middot f(32)

The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of

34 electrical characterization 27

AB

AC

DE

FE

DE

FE

AB

AC

B

Figure 34 CBCM pseudo-inverters

Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot

The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET

capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz

should be adequate to avoid the MOS behaviour of the TSV

34 electrical characterization

A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented

34 electrical characterization 28

AB

CADE

F

AB

E

CAD

B

Figure 35 Current as a function of frequency in CBCM

on the same chip that went beyond the core objectives of thiswork

The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed

341 Physical implementation

In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result

Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics

34 electrical characterization 29

Poly dummy fills

Metal2Metal1PolysiliconActive

1μm 5μm

Figure 36 CBCM pseudo-inverter pair in layout

Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38

The CBCM pseudo-inverters were implemented on either TOP

or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below

Structure rsquoArsquo Single TSV capacitance (TOP die)

The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to

34 electrical characterization 30

DTOP DBTM

B10 A10

BREF AREF

B1 A1

R1 R2

SET VDD

RESET GND

C15 C10

C50 C20

SET VDD

ERC SET

EREF EC Metal2 (TOP)Metal1 (TOP)

Metal1 (BTM)Metal2 (BTM)

Figure 37 PRC1 test pad module(layout)

Figure 38 PRC1 test pad module(micrograph)

34 electrical characterization 31

Structure A B

TSVs - 1 2x5 - 1 2x5

Inverterlocation

TOP TOP TOP BTM BTM BTM

TSVconnection

TOP TOP TOP BTM BTM BTM

Multi-TSVconnection

- - TOP - - BTM

Multi-TSVpitch (microm)

- - 15 - - 15

Pad name AREF A1 A10 BREF B1 B10

Structure C D

TSVs 2x2 2x2 2x2 2x2 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP TOP

TSVconnection

TOP TOP TOP TOP TOP TOP

Multi-TSVconnection

TOP TOP TOP TOP TOP BTM

Multi-TSVpitch (microm)

10 15 20 50 20 20

Pad name C10 C15 C20 C50 DTOP DBTM

Structure R E

TSVs 1 1 - 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP

TSVconnection

TOP TOP - TOP TOP

Multi-TSVconnection

- - - TOP BTM

Multi-TSVpitch (microm)

- - - 20 20

Pad name R1 R2 EREF EC ERC

Table 31 Test structures on the PRC1 module

34 electrical characterization 32

Pseudo-inverters

2x5 TSVs

TSV

Reference

Metal2 (TOP)Metal1 (TOP)

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)

the single TSV DUT the structure also contains 10 parallel TSVs

arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude

Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)

The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side

Structure rsquoCrsquo Variable pitch TSV capacitance

The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the

34 electrical characterization 33

2x5 TSVs

Reference

TSV

Pseudo-inverters

Metal2 (BTM)Metal1 (BTM)

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)

literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process

The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative

Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-

TOM die

The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking

34 electrical characterization 34

2x2 TSVs - 10μm

2x2 TSVs - 15μm

2x2 TSVs - 20μm

2x2 TSVs - 50μm

Metal2 (TOP)Metal1 (TOP)

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die

34 electrical characterization 35

Metal2 (TOP)Metal1 (TOP)

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Driver

Reference

Driver

Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)

Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy

Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability

Structure rsquoErsquo TSV capacitance effect on signal propagation delay

Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult

34 electrical characterization 36

ABC

D

ECBFC

ABC

Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)

ABC

DEF

CD

A

CA

A

Figure 316 Measurement setup

342 Measurement setup

The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316

The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-

2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup

34 electrical characterization 37

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

SET InputPulse generator

(Non-overlapping with RESET)

RESET InputPulse generator

(Non-overlapping with SET)

A[REF 1 10] InputSMU (Inverter source - 1V

supply)

B[REF 1 10] InputSMU (Inverter source - 1V

supply)

C[10 15 2050]

InputSMU (Inverter source - 1V

supply)

D[TOP BTM] InputSMU (Inverter source - 1V

supply)

R[1 2] InputSMU (Inverter source - 1V

supply)

E[REF C RC] Output Oscilloscope (high-impedance)

Table 32 PRC1 test pad instrument connections

sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32

343 Experimental results

There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods

When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy

34 electrical characterization 38

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 454

Mean capacitance 709 f F

Standard deviation (σ) 19 f F

Coefficient of variation (3σmicro) 80

Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo

The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability

Die to die (D2D) variability

The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)

On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid

On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691

34 electrical characterization 39

-60 -40 -20 00 20 40 60 80

σ-σ

706

Deviation from mean (fF)

0

005

01

015

02

025

Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 414

Mean capacitance 895 f F

Standard deviation (σ) 22 f F

Coefficient of variation (3σmicro) 74

Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo

of the experimental data within plusmnσ of the mean capacitancevalue

Correlation between TSV capacitance and oxide liner thickness vari-

ation

The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers

Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication

34 electrical characterization 40

σ-σ

691

Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70

0

002

004

006

008

01

012

014

016

018

02

Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function

075

080

085

090

095

100

105

110

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

075

080

085

090

095

100

105

Wafer area

Figure 319 TSV capacitance measurement results on wafer D11

34 electrical characterization 41

086088090092094096098100102104106108

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

086

088

090

092

094

096

098

100

102

104

106

108

Wafer area

Figure 320 TSV capacitance measurement results on wafer D18

tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo

Wafer to wafer (W2W) variability

Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters

34 electrical characterization 42

109

100

095Nor

mal

ized

oxi

de th

ickn

ess

-5-10 -15 -20 -25 -30

510

1520

2530

0Die Y

Die X

Figure 321 Simulated oxide liner thickness variation

60 65 70 75 80 85 90 95 1000

005

01

015

02

025D11 D18

plusmnσ

plusmnσ

Capacitance (fF)

Figure 322 TSV capacitance W2W variability - Probability density func-tions

34 electrical characterization 43

Structure rsquoCrsquo (Fig 311)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 424

Experiment C10 C50

Number of TSVs 4 4

Mean capacitance 4073 f F 4125 f F

Standard deviation (σ) 84 f F 74 f F

Population within σ 695 653

Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo

Variable pitch TSV capacitance

The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm

were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires

As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50

are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability

To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]

Measurement accuracy of CBCM

The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we

3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)

35 summary and conclusions 44

376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320

001

002

003

004

005

006C10 C50

Capacitance (fF)

plusmnσC10

plusmnσC50

695 653

Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function

-80 -60 -40 -20 00 20 40 60 80 1000

002

004

006

008

01

012

014

016

018σ-σ

688

Deviation from mean (fF)

Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function

observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs

connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance

35 summary and conclusions

In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement

35 summary and conclusions 45

Method LCR CBCM

Total dies 472 472

Processed dies (excluding outliers) 393 414

Mean capacitance 820 f F 895 f F

Standard deviation (σ) 23 f F 22 f F

Coefficient of variation (3σmicro) 86 74

Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo

results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]

The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration

The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV

parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM

In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process

4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E

41 introduction

The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV

filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance

Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models

In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]

42 tsv-induced thermomechanical stress

Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or

46

42 tsv-induced thermomechanical stress 47

Material CTE at 20 ˚C (10minus6K)

Si 3

Cu 17

W 45

Table 41 Coefficients of thermal expansion

Figure 41 TSV thermomechanical stress simulation [40]

polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate

Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction

It is well known that stress can affect carrier mobility in MOS-

FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance

Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type

43 test structure implementation 48

Figure 42 TSV thermomechanical stress interaction between two TSVs[40]

proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative

The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections

43 test structure implementation

TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible

43 test structure implementation 49

Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]

The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET

devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types

Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP

die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45

43 test structure implementation 50

A0

B0

C0

D0

E0

F0

G0

H0

I0

J0

A90

B90

E90

F90

G90

J90

DS3 DF3

DF2 DS0

DS2 DF0

DF1 DS1

SS SF

DEC6 VDD

DEC7 GND

DEC8 VG

DEC4 DEC9

DEC5 GND

DEC2 DEC3

DEC0 DEC1

Arraydecoder

Draindecoder

Figure 44 PTP1 test pad mod-ule (layout)

Figure 45 PTP1 test pad mod-ule (micrograph)

43 test structure implementation 51

Metal2Metal1PolyActive

TSV

MOSFETs

07μm007μm 0deg

07μm007μm 90deg

05μm05μm 90deg

05μm05μm 0deg

Figure 46 MOSFET dimensions and rotations

431 MOSFET arrays

Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)

The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs

lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44

The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)

Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution

43 test structure implementation 52

TSV

MOSFETs Substrate taps

Figure 47 1 TSV MOSFET array configuration

MOSFETs Substrate taps

TSV TSV

TSV TSV

Figure 48 4 diagonal TSVs MOSFET array configuration

43 test structure implementation 53

MOSFETs Substrate taps

TSV

TSV TSV

TSV

Figure 49 4 lateral TSVs MOSFET array configuration

MOSFETs Substrate taps

TSV TSV TSV

TSV TSV TSV

TSV TSV TSV

Figure 410 9 TSVs MOSFET array configuration

43 test structure implementation 54

Configuration RotationFET

width(microm)

FETlength(microm)

Name

0 0deg 07 007 A0

1 0deg 07 007 B0

4diagonal 0deg 07 007 C0

4lateral 0deg 07 007 D0

9 0deg 07 007 E0

0 0deg 05 05 F0

1 0deg 05 05 G0

4diagonal 0deg 05 05 H0

4lateral 0deg 05 05 I0

9 0deg 05 05 J0

0 90deg 07 007 A90

1 90deg 07 007 B90

9 90deg 07 007 E90

0 90deg 05 05 F90

1 90deg 05 05 G90

9 90deg 05 05 J90

Table 42 MOSFET array configurations for test pad modulesPTP1PTP2

43 test structure implementation 55

2μm 1μm

Metal2Metal1PolyActive

FET

DUMMY

TAP

TSV

Figure 411 MOSFET array detail

432 Digital selection logic

Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement

The procedure for selecting a single MOSFET device for meas-urement was as follows

bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled

bull MOSFET Gate terminal (G0 to G15) was connected to IO

pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates

bull MOSFET Drain terminal (D0 to D15) was connected to IO

pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder

43 test structure implementation 56

D0

D15

DF[0-3]

DS[0-3]

Drain-select decoder (2-bit)

SSF

SS

Array-select decoder (4-bit)

VG

Gate-select decoder (4-bit)

G0 G15

MOSFET Array

Figure 412 Digital selection logic

and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit

Drain-select decoder was adequate for accessing all 16Drain terminals in the array

When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)

A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating

43 test structure implementation 57

Gatedecoder

Metal2Metal1PolyActive

Transmissiongates (DS)

Transmission gates (G)

MOSFETArray

Figure 413 MOSFET array with digital selection logic

43 test structure implementation 58

BLK

A[0] A[1] A[2] A[3]

DEC[0-15]

Figure 414 4-bit Gate-enableArray-enable decoder

43 test structure implementation 59

DEC

VGGVDD

Transmission gate

Pull-downtransistor

Figure 415 Transmission gate for MOSFET Gate terminals

Transmission gates

FORCE

SENSE

DRAINSOURCE

EN

VDD

Figure 416 Transmission gates for MOSFET DrainSource terminals

43 test structure implementation 60

A[0] A[1]

DEC[0-3]

Figure 417 2-bit Drain-select decoder

44 experimental measurements 61

Configuration Type Name Wafer

1 No-TSVPMOS

(05microm times 05microm)F0 D08

2 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D08

3 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D20

4 1-TSV 0ordm 60 ordmCPMOS

(05microm times 05microm)G0 D08

54-lateral TSV 0ordm 25

ordmCPMOS

(05microm times 05microm)I0 D08

6 1-TSV 90ordm 25 ordmCPMOS

(05microm times 05microm)G90 D08

7 1-TSV 0ordm 25 ordmCPMOS (07microm times

007microm)B0 D08

Table 43 MOSFET array configurations selected for experimental eval-uation

44 experimental measurements

Test modules PTP1 and PTP2 implementing the presented MOS-

FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43

441 Data processing methodology

For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress

Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing

44 experimental measurements 62

ABCDEFBBC

FCDDFBCFFC

FCB

A

BAACADE

CECBBC

CFFCDE

B$BCFFBDBCBF

amp(BFC)B

BC

FCDDFA

CAF

Figure 418 Processing of measurement results

the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (

radic98) The described procedure for

processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not

placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)

442 Measurement setup

For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO

signal generation and measurement was provided by standard

44 experimental measurements 63

ABCDECDBF

ABECBF

AB

AB

Figure 419 Interpolation of measurement results

laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420

The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44

ABCCDE

F

AA

A

DBCB

Figure 420 Measurement setup

44 experimental measurements 64

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

DEC[0-9] InputDigital Output

(GATEDRAINARRAY SelectDecoders Input)

DF[0-3] Input SMU (Drain Force terminal)

DS[0-3] Output SMU (Drain Sense terminal)

SF Input SMU (Source Force terminal)

SS Output SMU (Source Sense terminal)

VG Input Digital Output (Gate voltage)

Table 44 PTP1PTP2 test pad module instrument connections

443 PMOS (long channel)

No-TSV configuration

This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance

As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure

1 TSV configuration

In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo

44 experimental measurements 65

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

096

097

098

099

100

101

Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)

and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements

To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm

The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph

44 experimental measurements 66

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

105

110

115

120TSV

Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)

44 experimental measurements 67

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

13

TSV

Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)

we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV

even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after

increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

09

1

11

12

13

14

Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)

44 experimental measurements 68

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115TSV

Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)

at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)

4 lateral TSV configuration

In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430

respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)

90˚ rotation TSV configuration

In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has

44 experimental measurements 69

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

085

090

095

100

105

110

115

TSV

TSV TSV

TSV

Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)

44 experimental measurements 70

2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309

095

1

105

11

115

12

X16

Distance (μm)

Normalized

Ion

TSV TSV

Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)

3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098

1102

Y14

Distance (μm)

Normalized

Ion

TSV TSV

Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)

44 experimental measurements 71

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115

120TSV

Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)

been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm

444 PMOS (short channel)

In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

08509

0951

10511

11512

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)

44 experimental measurements 72

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

094

096

098

100

102

104

106

108

110

TSV

Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)

measurements for X = 16 and Y = 14 in Figure 434 At at 1microm

distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)

45 summary and conclusions 73

Size

Waf

er

Con

figu

rati

on

Rot

atio

n

Eff

ect

1micro

mla

t

Eff

ect

1micro

mlo

ng

KO

Z

L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm

L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm

L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm

L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm

L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm

S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm

Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices

45 summary and conclusions

In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET

device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors

The structure was implemented in Imecrsquos experimental 65nm

3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations

In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits

Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short

45 summary and conclusions 74

channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress

Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure

1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex

2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy

3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away

4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic

The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated

In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs

5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S

51 introduction

TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2

DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs

increases linearly with the number of tiers and interconnections(Figure 51)

The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm

node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)

In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]

52 energy-recovery tsv interconnects

As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-

75

52 energy-recovery tsv interconnects 76

1XTSV

Tier1

TSV

Tier0LOGIC 0

LOGIC 0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n Tier1

Tier0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n

Tier(m-1)

Tier(m)

mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-

tion density

recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]

EER prop

(

RCL

T

)

CLV2DD (51)

In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]

ECMOS =12

CLV2DD (52)

Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits

EER

ECMOSprop

2RCL

T(53)

From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small

In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC

52 energy-recovery tsv interconnects 77

Tier2Tier1TSV

LOGIC 1

LOGIC 2

driver

TSV

LOGIC 1

LOGIC 2

CMOS

driver

Convert

Energy-recovery

PCG

Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery

can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV

interconnectsThe differences between a typical CMOS driver in a TSV-based

3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form

Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values

52 energy-recovery tsv interconnects 78

IN IN

ININ

OUT OUT

IN

IN

OUT

OUT

Figure 53 Adiabatic driver

Figure 54 A simple pulse-to-level converter (P2LC)

An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded

The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)

For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with

53 analysis 79

VDD2

Rind

RTG

AdiabaticDriver

CL

M1

T=1f

L ind

RTG

CL

Figure 55 Resonant pulse generator with dual-rail data encoding

Tier2

Tier1

f-fPULSE

VDD2CLK IN

Delay1CLK1

CLK0

Delay2CLK2OUT1

Data In

Data Out

CLKRES

TSV TSV TSV

CLKBufferAdiabatic

Driver

P2LC

f-f

Data1

M1

Figure 56 Energy-recovery scheme for TSV interconnects

a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at

f =1

2πradic

LindCL(54)

A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56

53 analysis

The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy

53 analysis 80

losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme

The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be

EER = EAD + Eind + EM1 + EP2L (55)

Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs

architecture used and could be easily extracted from simulationdata as will be shown later in this chapter

531 Adiabatic driver

There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS

dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances

The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle

ETG = 2ξ

(

RTGCL12 T

)

CLV2DD =

π2

2(RTGCL f )CLV2

DD (56)

The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient

53 analysis 81

Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV

capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be

CL = CTSV + Cn + 6CD (57)

The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience

κTG = RTGCn rArr RTG =κTG

Cn(58)

Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2

DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver

EAD = CnV2DD +

π2

2κTG

Cnf [CTSV + Cn + 6CD]

2 V2DD (59)

The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2

2 κTG f to a variable (y = π2

2 κTG f ) Equation 59then becomes

EAD = D middot CnV2DD +

y

Cn[CTSV + (6b + 1)Cn]

2 V2DD

=

[

Cn

(

D

y+ (6b + 1)2

)

+1

CnC2

TSV

]

middot yV2DD

+(12b + 2)CTSV middot yV2DD (510)

Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when

53 analysis 82

they become equal Solving equation Cn

(

Dy + (6b + 1)2

)

= 1Cn

C2TSV

for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV

Cn(opt) =

radic

[D

y+ (6b + 1)2]minus1CTSV (511)

Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver

EAD(min) =

[radic

D

(6b + 1)2y+ 1 + 1

]

middot (12b+ 2)yCTSVV2DD (512)

532 Inductorrsquos parasitic resistance

The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as

Qind =1

Rind

radic

Lind

CLrArr Rind =

1QindCL2π f

(513)

Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513

Eind =π2

2(RindCL f )CLV2

DD =π

4CL

QindV2

DD (514)

533 Switch M1

Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1

Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as

EM1(min) = 2IM1(rms)VGM1

radic

mκM1

f(515)

54 simulations 83

Delay2

CLK1

CLK2Delay1

CLKW

PULSE

CLK0T

PD

OUT1

PULSE

P2L

RES

CLK

Figure 57 Energy-recovery timing constraints

534 Timing Constraints

The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57

The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)

54 simulations

To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the

54 simulations 84

M1IN IN

IN

OUT OUT

IN

Figure 58 Test case for the evaluation of the theoretical model

Q250MHz 500MHz 750MHz

T S e () T S e () T S e ()

6 360 409 -120 476 516 -78 578 613 -57

8 308 347 -114 417 449 -70 515 543 -52

10 276 312 -115 382 411 -70 477 498 -43

12 255 289 -119 358 386 -71 451 472 -44

Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)

adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58

The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51

The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations

In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency

54 simulations 85

5 6 7 8 9 10 11 12 130

10

20

30

40

50

60

70

T250 S250 T500 S500 T750 S750

Q factor

Energydissipation(fF)

Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)

5 6 7 8 9 10 11 12 13

-14

-12

-10

-8

-6

-4

-2

0

250MHz 500MHz 750MHz

Q factor

Error()

Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator

The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation

54 simulations 86

M1

TSV

TSV

TSV

Data in OutCMOS

OutER

Energy-recovery

CMOS

P2LC

Adiabaticdriver

Figure 511 Test case for comparison of energy-recovery to CMOS

541 Comparison to CMOS

When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F

for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is

described by Equation 52 Appending the data switching activity(D)

ECMOS = D middot 12

CLV2DD (516)

Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS

circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison

Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range

54 simulations 87

Figure 512 Improved P2LC design

0 100 200 300 400 500 600 700 800 9000

2

4

6

8

10

12

14

Frequency (MHz)

Energy

dissipation(fJ)

Figure 513 Simulated energy dissipation of improved P2LC design

Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current

The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor

54 simulations 88

4 6 8 10 12 14 16 18 20 22-20

-10

0

10

20

30

40

50

60

800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor

Energyred uction()

Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)

4 6 8 10 12 14 16 18 20 220050

100150200250300350400450

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)

values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors

In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies

In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q

factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total

In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS

circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in

54 simulations 89

4 6 8 10 12 14 16 18 20 2200

100

200

300

400

500

600

700

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)

20 40 60 80 100 120 140 160 180 200 220-100

00

100

200

300

400

500

Q6 Q10 Q14 Q18

TSV capacitance (fF)

Energy

red u

ction(

)

Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)

Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value

Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz

Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm

node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS

54 simulations 90

4 6 8 10 12 14 16 18 20 22-40

-20

0

20

40

60

80

D=05 D=07 D=09 D=1

Q factor

Energyred uction()

Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)

4 6 8 10 12 14 16 18 20 220

10

20

30

40

50

60

130nm 65nm

Q factor

Energyred uction()

Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)

55 summary and conclusions 91

55 summary and conclusions

The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process

The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement

Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme

The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process

6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R

61 introduction

In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance

Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements

62 physical implementation

The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5

The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61

The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz

were calculated using the methodology developed in Chapter 5

and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery

circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)

92

62 physical implementation 93

AB

AB

CDEFD

C

ABCD

EFD

C

Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)

Adiabatic driver 80-bit circuit

Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )

RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)

CL 207 f F (Eq57) Lind 613nH (Eq54)

Table 61 Energy-recovery IO circuit parameters

62 physical implementation 94

IntegratedInductor

VINDOER

PulseGenerator

S1AS2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

Figure 62 Energy-recovery IO circuit schematic diagram

Metal2Metal1PolyActiveVDD

GND

RES_CLK

DATA

OUT

OUT_B

DATA

OUT_B OUT

RES_CLK

GND

VDD

Figure 63 Adiabatic driver (layoutschematic view)

the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs

621 Adiabatic driver

The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63

62 physical implementation 95

Metal2Metal1PolyActive

IN

OUT_B

VDD

IN_B

OUT

GND

OUT_B

IN

GND

VDD

OUT

IN_B

Figure 64 P2LC (layoutschematic view)

622 Pulse-to-level converter

The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals

623 Data generator

A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1

As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE

624 Pulse generator

The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the

62 physical implementation 96

Metal2Metal1PolyActive EN

CLK

VDD

Q

GND

D

CLKGND

ENVDD

QF-F

D

GND

VDDCLK

D Q

Figure 65 Data generator (layoutschematic view)

Delay1

WPULSE

PULSE

CLKTCLK

DATA

CLK_RES

Figure 66 Timing constraints for the energy-recovery demonstratorcircuit

62 physical implementation 97

PULSES1A

S2

GND

VDD

Metal2Metal1PolyActive

VDD

S1

S2

GND

PULSE

Figure 67 Pulse generator (layoutschematic view)

S2ΔPhase

WPULSE

PULSE

S1A

Figure 68 Pulse generator InputOutput signals

gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control

For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68

625 Integrated inductor

The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-

1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)

62 physical implementation 98

100 200 300 400 500 600 700 8005

7

9

11

13

15

17

19

21

10

15

20

25

30

35

40

45

Q (Simulated) Energy Reduction ()

Frequency (MHz)

Qfactor

Energyreduction()

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS

lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)

As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited

For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610

626 Top level design

The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)

For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously

62 physical implementation 99

500μ

m

500μm

Figure 610 Planar square spiral inductor designed in ASITIC

shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance

The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG

80 = 16880 = 21Ω However as

the current flows to lower branch levels in the tree the perceivedresistance increases to RTG

16 RTG4 and on the last one to RTG Any

additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512

Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20

4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit

To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of

62 physical implementation 100

Adiabatic drivers

RTG

16

RTG

80

RTG

4

RTG

Level 1Level 2Level 3Level 4

Figure 611 Resonant pulse distribution tree

Branch level Wire resistance

1st 5 middot 16880 = 105mΩ

2nd 5 middot 16816 = 525mΩ

3rd 5 middot 1684 = 21Ω

4th 5 middot 168Ω = 84Ω

Total load 168+8480 + 21

20 + 05255 + 0105 = 252Ω

Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree

63 post-layout simulation and comparison 101

+-

+-Metal2 Metal1

180μ

m

100μm

Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)

the energy-recovery IO circuit in layout view is illustrated inFigure 613

627 CMOS IO circuit

The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance

The top level layout view of the CMOS IO circuit can be seenin Figure 616

63 post-layout simulation and comparison

The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed

63 post-layout simulation and comparison 102

22mm

12mm

GND GNDVBUF1 VDR

GND

VIND

S1A

S2

EN

OER

Figure 613 Energy-recovery IO circuit in layout view

63 post-layout simulation and comparison 103

OCMOS

S1B DataGenerator

TSV

CMOSDriver

CMOSDriver

TSV Buffer

1

80

EN

Figure 614 CMOS IO circuit schematic diagram

GND

VCMOS

DATA

078u

039u

105u

053u

316u

158u

950u

475u

OUT

Figure 615 CMOS driver schematic

12mm

12mm

OCMOS S1B GND VBUF2 VCMOSGND

GND GNDEN VCMOS

Figure 616 CMOS IO circuit in layout view

63 post-layout simulation and comparison 104

Energy dissipation (80-bits cycle)

Inductor 11pJ (Eq514)

Switch M1 092pJ [ ]

P2LCs 098pJ (Simulated)

Adiabatic drivers 315pJ (Eq512)

Total 615pJ

Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz

After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency

Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme

As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the

63 post-layout simulation and comparison 105

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14RES_CLK

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14DATA

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14OER

Time (ns)

V(V

)

Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit

64 experimental measurements 106

AreaEnergy

dissipation(Theory)

Energydissipation

(Post-layout)

CMOS 144mm2922pJ

(115 f Jbit)

(Eq52)

1265pJ

(158 f Jbit)

Energy recovery 264mm2 615pJ

(77 f Jbit)

768pJ

(96 f Jbit)

Difference +83 minus333 minus393

Table 64 Post-layout simulated energy dissipation Area require-ments

theoretically calculated 333 up to 393 in the post-layoutsimulations

64 experimental measurements

The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively

641 Measurement setup

For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620

The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66

642 Results

CMOS IO circuit

In Figure 621 the input signal S1B and the output signal OCMOS

are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency

64 experimental measurements 107

Figure 618 Energy-recovery IO circuit micrograph

Figure 619 CMOS IO circuit micrograph

AABC

DEFF

D

EEBA

EBBA

Figure 620 Measurement setup

64 experimental measurements 108

Test pad Direction Instrument connection

VIND Power Power supply - gt06V

GND Power Power supply - 0V

S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)

EN InputDigital Output (enable outputdata switching)

OER Output Oscilloscope

VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V

VBUF1 Power Power supply (Buffers) - 12V

Table 65 Energy-recovery circuit instrument connections

Test pad Direction Instrument connection

VCMOS PowerPower supply (CMOS drivers) -12V

GND Power Power supply - 0V

VBUF2 Power Power supply (Buffers) - 12V

S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

OCMOS Output Oscilloscope

EN InputDigital Output (enable outputdata switching)

Table 66 CMOS circuit instrument connections

64 experimental measurements 109

S1B

OCMOS

Figure 621 Oscilloscope output for signals S1B and OCMOS

100 200 300 400 500 600 700 800012345678910

Die1 Die2 Simulation

Frequency (MHz)

Powerdissipation(mW)

Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit

that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected

The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations

Energy-recovery IO circuit

Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER

64 experimental measurements 110

100 200 300 400 500 600 700 800024681012141618

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 623 Measuredsimulated energy dissipation for the CMOScircuit

S1A

OER

Figure 624 Oscilloscope output for signals S1A and OER

shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected

The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)

Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate

65 summary and conclusions 111

200 250 300 350 400 450 500 550 600 65002468101214161820

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit

IntegratedInductor

VINDOER

PulseGenerator

S1A

S2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

200pF

Figure 626 Energy-recovery IO circuit with fault hypothesis

properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER

switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation

65 summary and conclusions

In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them

The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-

65 summary and conclusions 112

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050

055

060

065

070

075

080RES_CLK

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500

02

04

06

08

10

12

14OER

Time (ns)

V(V

)

Figure 627 Simulation of circuit with fault hypothesis

65 summary and conclusions 113

200 250 300 350 400 450 500 550 600 6500

5

10

15

20

25

Die1 Die2 Simulation CL=200pF

Frequency (MHz)

Ene

rgy

diss

ipat

ion

(pJ)

Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis

ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS

circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated

CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

7C O N C L U S I O N S

Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density

As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV

devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs

Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-

FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process

Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process

The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work

114

71 main contributions 115

71 main contributions

From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data

The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations

In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design

72 future work 116

approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances

The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

72 future work

The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV

capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]

In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes

1 The transistors should be in a regular grid in the array forsimple processing of the measurement data

72 future work 117

2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy

3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV

4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues

Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility

In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory

AA P P E N D I X A P H O T O S

Figure A1 3D65 300mm wafer (on chuck)

Figure A2 Probe card

118

appendix a photos 119

Figure A3 3D130 200mm wafer (on chuck)

ProbeHead

Microscope

Figure A4 Probe head

BA P P E N D I X B S TAT I S T I C S

mean (micro)

In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as

micro =1n

n

sumi=1

xi

median (m)

The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean

standard deviation (σ)

Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation

A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability

standard deviation of the mean (σmicro )

The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated

Assuming there is a data set X with n values xi of mean micro

120

appendix b statistics 121

variance(X) = σ2X

variance(micro) = variance( 1n sum

ni=1 xi) =

1n2 variance(sumn

i=1 xi) =1n2 sum

ni=1 variance(xi) =

nn2 variance(X) =1n variance(X) =

σ2Xn rarr

σmicro = σXradicn

coefficient of variation

The coefficient of variation is the ratio of the standard deviationσ to the mean micro

CV =σ

micro

It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage

outlier

Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set

kruskalndash wallis [34]

KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal

B I B L I O G R A P H Y

[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac

currentsourcebroadpurposemn=2602

[2] Max-3d URL httpwwwmicromagiccom

[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet

[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet

[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001

[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI

Design Conf pages 171ndash174 2005 doi 101109ICVD200564

[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf

3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581

[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009

[9] C H Bennett Logical reversibility of computation IBM

Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525

[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038

scientificamerican0785-48

[11] E Beyne 3d system integration technologies In Proc Int

VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113

122

bibliography 123

[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs

22nm-Details_Presentationpdf

[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327

[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534

[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices

Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124

[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942

[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc

IEEE Int Interconnect Technology Conf and 2011 Materials for

Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326

[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf

(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823

[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS

rsquo04 volume 2 2004 doi 101109ISCAS20041329255

[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE

Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260

[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-

bibliography 124

asitics for 3-dimensional integrated circuits In Workshop

Notes Design Automation and Test in Europe (DATE) 2009

[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT

2008 pages 98ndash103 2008 doi 101109IDT20084802475

[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec

beScientificReportSR2009HTML1213307html

[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905

[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE

Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329

[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int

Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311

[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508

[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712

[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii

B9780750677295500446

[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455

bibliography 125

[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591

[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test

Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889

[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on

Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10

1145224081224115

[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical

Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779

[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom

pressroompdfkkuhnKuhn_IWCE_invited_textpdf

[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5

(3)183ndash191 1961 doi 101147rd530183

[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827

[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190

[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf

(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839

[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-

nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079

bibliography 126

[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System

level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm

org101145966747966750

[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453

[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf

PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725

[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE

Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793

[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629

[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp

Exhibition (DATE) pages 1689ndash1694 2010

[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573

[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190

[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi

bibliography 127

A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices

Meeting (IEDM) 2010 doi 101109IEDM20105703278

[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727

[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10

1109JPROC1998658762

[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test

Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103

[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc

56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676

[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443

[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88

(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom

sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on

[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303

bibliography 128

[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium

on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145

10576611057669

[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872

[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004

[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-

sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888

[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156

[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc

IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522

[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501

[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667

[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477

[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-

tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609

bibliography 129

[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology

Conf 2006 doi 101109ECTC20061645678

[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power

Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220

[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc

Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786

[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-

Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338

[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088

[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044

[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int

Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763

[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-

tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http

dxdoiorg101007BF01788562 101007BF01788562

bibliography 130

[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-

electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww

sciencedirectcomsciencearticleB6V44-4829XDJ-2N

2f5b2c221dcf240c38cd13b7b3432f913

[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182

[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541

[78] NHE Weste and DF Harris CMOS VLSI design a cir-

cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775

[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite

printsteve_wozniak_interview

[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-

tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716

[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect

Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710

[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746

[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764

[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915

bibliography 131

[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610

  • Abstract
  • Publications
  • Acknowledgments
  • Contents
  • List of Figures
  • List of Tables
  • Acronyms
  • 1 Introduction
    • 11 Research goals and contribution
    • 12 Thesis outline
      • 2 Background
        • 21 3D integrated circuits
          • 211 3D integration approaches
          • 212 TSV interconnects
            • 22 Energy-recovery logic
              • 221 Energy-recovery architectures
              • 222 Power-clock generators
                • 23 Full-custom design flow
                • 24 Summary
                  • 3 Characterization of TSV capacitance
                    • 31 Introduction
                    • 32 TSV parasitic capacitance
                    • 33 Integrated capacitance measurement techniques
                    • 34 Electrical characterization
                      • 341 Physical implementation
                      • 342 Measurement setup
                      • 343 Experimental results
                        • 35 Summary and conclusions
                          • 4 TSV proximity impact on MOSFET performance
                            • 41 Introduction
                            • 42 TSV-induced thermomechanical stress
                            • 43 Test structure implementation
                              • 431 MOSFET arrays
                              • 432 Digital selection logic
                                • 44 Experimental measurements
                                  • 441 Data processing methodology
                                  • 442 Measurement setup
                                  • 443 PMOS (long channel)
                                  • 444 PMOS (short channel)
                                    • 45 Summary and conclusions
                                      • 5 Evaluation of energy-recovery for TSV interconnects
                                        • 51 Introduction
                                        • 52 Energy-recovery TSV interconnects
                                        • 53 Analysis
                                          • 531 Adiabatic driver
                                          • 532 Inductors parasitic resistance
                                          • 533 Switch M1
                                          • 534 Timing Constraints
                                            • 54 Simulations
                                              • 541 Comparison to CMOS
                                                • 55 Summary and conclusions
                                                  • 6 Energy-recovery 3D-IC demonstrator
                                                    • 61 Introduction
                                                    • 62 Physical implementation
                                                      • 621 Adiabatic driver
                                                      • 622 Pulse-to-level converter
                                                      • 623 Data generator
                                                      • 624 Pulse generator
                                                      • 625 Integrated inductor
                                                      • 626 Top level design
                                                      • 627 CMOS IO circuit
                                                        • 63 Post-layout simulation and comparison
                                                        • 64 Experimental measurements
                                                          • 641 Measurement setup
                                                          • 642 Results
                                                            • 65 Summary and conclusions
                                                              • 7 Conclusions
                                                                • 71 Main contributions
                                                                • 72 Future work
                                                                  • A Appendix A Photos
                                                                  • B Appendix B Statistics
                                                                  • Bibliography
Page 8: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate

C O N T E N T S

1 Introduction 1

11 Research goals and contribution 2

12 Thesis outline 3

2 Background 5

21 3D integrated circuits 5

211 3D integration approaches 6

212 TSV interconnects 9

22 Energy-recovery logic 13

221 Energy-recovery architectures 17

222 Power-clock generators 18

23 Full-custom design flow 20

24 Summary 21

3 Characterization of TSV capacitance 23

31 Introduction 23

32 TSV parasitic capacitance 23

33 Integrated capacitance measurement techniques 24

34 Electrical characterization 27

341 Physical implementation 28

342 Measurement setup 36

343 Experimental results 37

35 Summary and conclusions 44

4 TSV proximity impact on MOSFET performance 46

41 Introduction 46

42 TSV-induced thermomechanical stress 46

43 Test structure implementation 48

431 MOSFET arrays 51

432 Digital selection logic 55

44 Experimental measurements 61

441 Data processing methodology 61

442 Measurement setup 62

443 PMOS (long channel) 64

444 PMOS (short channel) 71

45 Summary and conclusions 73

5 Evaluation of energy-recovery for TSV interconnects 75

51 Introduction 75

52 Energy-recovery TSV interconnects 75

53 Analysis 79

531 Adiabatic driver 80

532 Inductorrsquos parasitic resistance 82

533 Switch M1 82

534 Timing Constraints 83

54 Simulations 83

541 Comparison to CMOS 86

vi

contents vii

55 Summary and conclusions 91

6 Energy-recovery 3D-IC demonstrator 92

61 Introduction 92

62 Physical implementation 92

621 Adiabatic driver 94

622 Pulse-to-level converter 95

623 Data generator 95

624 Pulse generator 95

625 Integrated inductor 97

626 Top level design 98

627 CMOS IO circuit 101

63 Post-layout simulation and comparison 101

64 Experimental measurements 106

641 Measurement setup 106

642 Results 106

65 Summary and conclusions 111

7 Conclusions 114

71 Main contributions 115

72 Future work 116

a Appendix A Photos 118

b Appendix B Statistics 120

bibliography 122

L I S T O F F I G U R E S

Figure 11 Scaling trend of logic and interconnect delay[30] 2

Figure 21 Comparison of 2D3D Integration 5

Figure 22 System-in-a-package 6

Figure 23 Package-on-package 6

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7

Figure 25 3D wafer-level-packaging 7

Figure 26 3D-stacked IC 8

Figure 27 TSV process steps 9

Figure 28 Before thinningafter thinning 10

Figure 29 TSV stacking 10

Figure 210 Conventional AND gate 14

Figure 211 CMOS switching 16

Figure 212 Adiabatic switching 16

Figure 213 Basic 2N-2P adiabatic circuit 17

Figure 214 2N-2P timing 17

Figure 215 Adiabatic driver 18

Figure 216 Stepwise charging generator 19

Figure 217 1N power clock generator 20

Figure 218 Design flow 22

Figure 219 Inverter in Virtuoso schematic 22

Figure 220 Inverter in Virtuoso layout 22

Figure 31 Isolated 2D wire capacitance 24

Figure 32 TSV parasitic capacitance 25

Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25

Figure 34 CBCM pseudo-inverters 27

Figure 35 Current as a function of frequency in CBCM 28

Figure 36 CBCM pseudo-inverter pair in layout 29

Figure 37 PRC1 test pad module (layout) 30

Figure 38 PRC1 test pad module (micrograph) 30

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34

Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35

viii

List of Figures ix

Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35

Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36

Figure 316 Measurement setup 36

Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39

Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40

Figure 319 TSV capacitance measurement results onwafer D11 40

Figure 320 TSV capacitance measurement results onwafer D18 41

Figure 321 Simulated oxide liner thickness variation 42

Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42

Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44

Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44

Figure 41 TSV thermomechanical stress simulation [40] 47

Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48

Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49

Figure 44 PTP1 test pad module (layout) 50

Figure 45 PTP1 test pad module (micrograph) 50

Figure 46 MOSFET dimensions and rotations 51

Figure 47 1 TSV MOSFET array configuration 52

Figure 48 4 diagonal TSVs MOSFET array configuration52

Figure 49 4 lateral TSVs MOSFET array configuration 53

Figure 410 9 TSVs MOSFET array configuration 53

Figure 411 MOSFET array detail 55

Figure 412 Digital selection logic 56

Figure 413 MOSFET array with digital selection logic 57

Figure 414 4-bit Gate-enableArray-enable decoder 58

Figure 415 Transmission gate for MOSFET Gate terminals59

Figure 416 Transmission gates for MOSFET Drain-Source terminals 59

Figure 417 2-bit Drain-select decoder 60

Figure 418 Processing of measurement results 62

Figure 419 Interpolation of measurement results 63

Figure 420 Measurement setup 63

Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65

List of Figures x

Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66

Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66

Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67

Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67

Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68

Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69

Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69

Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70

Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70

Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71

Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71

Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72

Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72

Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76

Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77

Figure 53 Adiabatic driver 78

Figure 54 A simple pulse-to-level converter (P2LC) 78

Figure 55 Resonant pulse generator with dual-rail dataencoding 79

Figure 56 Energy-recovery scheme for TSV interconnects79

Figure 57 Energy-recovery timing constraints 83

Figure 58 Test case for the evaluation of the theoret-ical model 84

Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85

Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85

Figure 511 Test case for comparison of energy-recoveryto CMOS 86

Figure 512 Improved P2LC design 87

List of Figures xi

Figure 513 Simulated energy dissipation of improvedP2LC design 87

Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88

Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88

Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89

Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =

80 f F f = 500MHz) 89

Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90

Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90

Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93

Figure 62 Energy-recovery IO circuit schematic dia-gram 94

Figure 63 Adiabatic driver (layoutschematic view) 94

Figure 64 P2LC (layoutschematic view) 95

Figure 65 Data generator (layoutschematic view) 96

Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96

Figure 67 Pulse generator (layoutschematic view) 97

Figure 68 Pulse generator InputOutput signals 97

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98

Figure 610 Planar square spiral inductor designed inASITIC 99

Figure 611 Resonant pulse distribution tree 100

Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101

Figure 613 Energy-recovery IO circuit in layout view 102

Figure 614 CMOS IO circuit schematic diagram 103

Figure 615 CMOS driver schematic 103

Figure 616 CMOS IO circuit in layout view 103

Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105

Figure 618 Energy-recovery IO circuit micrograph 107

List of Figures xii

Figure 619 CMOS IO circuit micrograph 107

Figure 620 Measurement setup 107

Figure 621 Oscilloscope output for signals S1B and OC-MOS 109

Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109

Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110

Figure 624 Oscilloscope output for signals S1A and OER110

Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111

Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111

Figure 627 Simulation of circuit with fault hypothesis 112

Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113

Figure A1 3D65 300mm wafer (on chuck) 118

Figure A2 Probe card 118

Figure A3 3D130 200mm wafer (on chuck) 119

Figure A4 Probe head 119

L I S T O F TA B L E S

Table 21 3D interconnect technologies comparison 8

Table 22 Technology parameters for Imec 3D130 process11

Table 23 Technology parameters for Imec 3D65 process11

Table 24 Comparison energy-recoveryconventionalCMOS 16

Table 31 Test structures on the PRC1 module 31

Table 32 PRC1 test pad instrument connections 37

Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38

Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39

Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43

Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45

Table 41 Coefficients of thermal expansion 47

Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54

Table 43 MOSFET array configurations selected forexperimental evaluation 61

Table 44 PTP1PTP2 test pad module instrumentconnections 64

Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73

Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84

Table 61 Energy-recovery IO circuit parameters 93

Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100

Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104

Table 64 Post-layout simulated energy dissipation Area requirements 106

Table 65 Energy-recovery circuit instrument connec-tions 108

Table 66 CMOS circuit instrument connections 108

xiii

A C R O N Y M S

2D two-dimentional

2N-2P 2 nMOS2 pMOS adiabatic logic

3D-IC three-dimentional integrated circuit

3D-SIC 3D stacked-IC

3D-SIP 3D system-in-package

3D three-dimentional

3D-WLP 3D wafer-level-packaging

BCB benzocyclobutene

BEOL back-end-of-line

CAL clocked adiabatic logic

CBCM charge-based capacitance measurement

CMOS complementary metal-oxide-semiconductor

CMP chemical-mechanical planarization

CTE coefficient of thermal expansion

Cu copper

CVD chemical vapor deposition

D2D die-to-die

D2W die-to-wafer

DCVSL differential cascode voltage switch logic

DRC design rule check

DUT device under test

ECRL efficient charge recovery logic

EDA electronic design automation

FEM finite element method

FEOL front-end-of-line

GPIB general purpose interface bus

IC integrated circuit

xiv

acronyms xv

IO inputoutput

KOZ keep-out-zone

LC inductance-capacitance

LVS layout versus schematic

MEMS microelectromechanical systems

MIM metal-insulator-metal

MOSFET metal-oxide-semiconductor field-effect transistor

MOS metal-oxide-semiconductor

MPU microprocessor unit

MuGFET multiple gate field-effect transistor

NMOS n-channel MOSFET

P2LC pulse-to-level converter

PAL pass-transistor adiabatic logic

PCB printed circuit board

PCG power-clock generator

PMD pre-metal dielectric

PMOS p-channel MOSFET

PVT process voltage temperature

QSERL quasi-static energy recovery logic

RC resistance-capacitance

RDL re-distribution layer

RLC resistance-inductance-capacitance

RMS root mean square

SIP system-in-a-package

Si silicon

SMU source measurement unit

SoC system-on-chip

TaN tantalum nitride

TSV through-silicon via

W tungsten

acronyms xvi

W2W wafer-to-wafer

WLP wafer-level-packaging

1I N T R O D U C T I O N

Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965

chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]

As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects

In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)

The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and

1 International Technology Roadmap for Semiconductors

1

11 research goals and contribution 2

Figure 11 Scaling trend of logic and interconnect delay [30]

innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]

Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D

interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density

11 research goals and contribution

As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-

ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]

The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future

12 thesis outline 3

3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection

The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution

The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs

are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process

The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements

12 thesis outline

This thesis is organised into seven Chapters and two Appendixesas follows

2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)

12 thesis outline 4

Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described

Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods

Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process

Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design

Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements

Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work

Appendix A includes photos used as reference in various partsof this work

Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data

2B A C K G R O U N D

The work presented in this thesis contributes in the fields of 3D

integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters

21 3d integrated circuits

CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package

There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy

Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of

ABCDEFBA ABCDEFBA

Figure 21 Comparison of 2D3D Integration

5

21 3d integrated circuits 6

ABCDEFBC

E

Figure 22 System-in-a-package

ABCDEFBC

E

Figure 23 Package-on-package

a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities

The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor

211 3D integration approaches

3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure

bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages

bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)

bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion

The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)

3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps

21 3d integrated circuits 7

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]

A

BCDEAFE

FFDC

CFE

Figure 25 3D wafer-level-packaging

take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]

3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]

The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-

SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost

A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21

21 3d integrated circuits 8

ABCD

EFFDB

F

Figure 26 3D-stacked IC

3D interconnect

technologyAdvantages Disadvantages

3D system-in-package (3D-SIP)

Simple existingtechnology bestheterogeneous

integration

Performance(speed power) low

densitymanufacturing costin high-quantities

3D wafer-level-packaging (3D-

WLP)

Existingtechnology good

compromisebetween

performance-manufacturing

cost

Averageperformance and

density

3D stacked-IC (3D-SIC)

Performance(speed power)

high densitymanufacturing costin high-quantities

Manufacturing costin low-quantities

not ready formanufacturing

Table 21 3D interconnect technologies comparison

21 3d integrated circuits 9

ABBCDE

FBACACDA

ACCD F

Figure 27 TSV process steps

212 TSV interconnects

Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection

In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie

Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively

As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]

21 3d integrated circuits 10

AAAB

CAAAB

ABC

DEF

Figure 28 Before thinningafter thinning

ABC ABC

DE DE

F

DEE

B

CBA

Figure 29 TSV stacking

21 3d integrated circuits 11

Minimum devicechannel length (nm)

130

Supply voltage (V) 12

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 22

TSV resistance (mΩ) ~20

TSV capacitance( f F)

~40

Wafer diameter(mm)

200

Table 22 Technology paramet-ers for Imec 3D130

process

Minimum devicechannel length (nm)

70

Supply voltage (V) 10

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 40

TSV resistance (mΩ) ~40

TSV capacitance( f F)

~80

Wafer diameter(mm)

300

Table 23 Technology paramet-ers for Imec 3D65 pro-cess

Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing

Design tools

Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs

Thermomechanical stress

TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur

21 3d integrated circuits 12

The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance

Thermal analysis

The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink

Electrical characteristics

The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]

Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV

structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process

22 energy-recovery logic 13

Testing

Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems

22 energy-recovery logic

TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology

1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections

2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them

Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that

22 energy-recovery logic 14

Figure 210 Conventional AND gate

were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6

Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded

Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output

Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to

22 energy-recovery logic 15

perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]

In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]

The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =

CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]

E = QVDD = CV2DD (21)

From the total transferred energy only half ( 12 CV2

DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C

The adiabatic charging of an equivalent node capacitance C

is modelled in Figure 212 In contrast to the conventional CMOS

circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to

EAdiabatic prop

(

RC

T

)

CV2DD (22)

The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters

22 energy-recovery logic 16

A

B

CC

D

E

D

Figure 211 CMOS switching

AA

B

C

B

Figure 212 Adiabatic switching

Conventional CMOS Energy-recovery CMOS

2 states True False 3 states True False Off

Energy of information bitdissipated as heat on the

pull-down network

Energy of information bitrecovered by the

power-supply

High speed Lowmedium speed

High energy dissipation Very low energy dissipation

Area efficient Some area overhead

Table 24 Comparison energy-recoveryconventional CMOS

Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]

A comparison between energy-recovery and conventional CMOSis summarised in Table 24

22 energy-recovery logic 17

ABCABC

Figure 213 Basic 2N-2P adiabaticcircuit

ABC

ABC

DEF

EE

DEF

Figure 214 2N-2P timing

221 Energy-recovery architectures

Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency

A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P

circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern

The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel

22 energy-recovery logic 18

Figure 215 Adiabatic driver

MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network

The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]

The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle

Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]

222 Power-clock generators

Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and

22 energy-recovery logic 19

A

A

B

C

C

Figure 216 Stepwise charging generator

recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators

Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow

Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at

f =1

2πradic

LCLOAD

(23)

As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is

23 full-custom design flow 20

A

BCD

E

DFF

CD

Figure 217 1N power clock generator

partially recovered and stored in the inductor for later use RLC

oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip

23 full-custom design flow

All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal

In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available

The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case

1 Cadence Design Systems Inc (httpwwwcadencecom)

24 summary 21

the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step

Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics

24 summary

In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process

2 Mentor Graphics Inc (httpwwwmentorcom)

24 summary 22

ABCDEA

FCCAAE

ABCDEAEC

DE

E

F

CEA

EEAEA

EDE

CAC

E

EC

CAC

Figure 218 Design flow

Figure 219 Inverter in Virtuososchematic

Figure 220 Inverter in Virtuosolayout

3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E

31 introduction

In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology

In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]

From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system

Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated

32 tsv parasitic capacitance

A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric

1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments

23

33 integrated capacitance measurement techniques 24

ABC

Figure 31 Isolated 2D wire capacitance

material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is

C =εox

hwl (31)

In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material

In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)

The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)

33 integrated capacitance measurement techniques

Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV

33 integrated capacitance measurement techniques 25

ABCDE

F

FD

F

F

FD

Figure 32 TSV parasitic capacitance

ABCDE FACDE

ECDE

BCFACDE

FCCDED DBCDE

D

DAB

Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor

33 integrated capacitance measurement techniques 26

characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories

1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]

2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]

The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer

On-chip techniques can achieve better accuracy when the DUT

capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]

In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows

CDUT =Im minus Ire f

VDD middot f(32)

The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of

34 electrical characterization 27

AB

AC

DE

FE

DE

FE

AB

AC

B

Figure 34 CBCM pseudo-inverters

Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot

The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET

capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz

should be adequate to avoid the MOS behaviour of the TSV

34 electrical characterization

A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented

34 electrical characterization 28

AB

CADE

F

AB

E

CAD

B

Figure 35 Current as a function of frequency in CBCM

on the same chip that went beyond the core objectives of thiswork

The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed

341 Physical implementation

In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result

Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics

34 electrical characterization 29

Poly dummy fills

Metal2Metal1PolysiliconActive

1μm 5μm

Figure 36 CBCM pseudo-inverter pair in layout

Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38

The CBCM pseudo-inverters were implemented on either TOP

or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below

Structure rsquoArsquo Single TSV capacitance (TOP die)

The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to

34 electrical characterization 30

DTOP DBTM

B10 A10

BREF AREF

B1 A1

R1 R2

SET VDD

RESET GND

C15 C10

C50 C20

SET VDD

ERC SET

EREF EC Metal2 (TOP)Metal1 (TOP)

Metal1 (BTM)Metal2 (BTM)

Figure 37 PRC1 test pad module(layout)

Figure 38 PRC1 test pad module(micrograph)

34 electrical characterization 31

Structure A B

TSVs - 1 2x5 - 1 2x5

Inverterlocation

TOP TOP TOP BTM BTM BTM

TSVconnection

TOP TOP TOP BTM BTM BTM

Multi-TSVconnection

- - TOP - - BTM

Multi-TSVpitch (microm)

- - 15 - - 15

Pad name AREF A1 A10 BREF B1 B10

Structure C D

TSVs 2x2 2x2 2x2 2x2 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP TOP

TSVconnection

TOP TOP TOP TOP TOP TOP

Multi-TSVconnection

TOP TOP TOP TOP TOP BTM

Multi-TSVpitch (microm)

10 15 20 50 20 20

Pad name C10 C15 C20 C50 DTOP DBTM

Structure R E

TSVs 1 1 - 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP

TSVconnection

TOP TOP - TOP TOP

Multi-TSVconnection

- - - TOP BTM

Multi-TSVpitch (microm)

- - - 20 20

Pad name R1 R2 EREF EC ERC

Table 31 Test structures on the PRC1 module

34 electrical characterization 32

Pseudo-inverters

2x5 TSVs

TSV

Reference

Metal2 (TOP)Metal1 (TOP)

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)

the single TSV DUT the structure also contains 10 parallel TSVs

arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude

Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)

The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side

Structure rsquoCrsquo Variable pitch TSV capacitance

The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the

34 electrical characterization 33

2x5 TSVs

Reference

TSV

Pseudo-inverters

Metal2 (BTM)Metal1 (BTM)

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)

literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process

The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative

Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-

TOM die

The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking

34 electrical characterization 34

2x2 TSVs - 10μm

2x2 TSVs - 15μm

2x2 TSVs - 20μm

2x2 TSVs - 50μm

Metal2 (TOP)Metal1 (TOP)

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die

34 electrical characterization 35

Metal2 (TOP)Metal1 (TOP)

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Driver

Reference

Driver

Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)

Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy

Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability

Structure rsquoErsquo TSV capacitance effect on signal propagation delay

Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult

34 electrical characterization 36

ABC

D

ECBFC

ABC

Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)

ABC

DEF

CD

A

CA

A

Figure 316 Measurement setup

342 Measurement setup

The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316

The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-

2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup

34 electrical characterization 37

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

SET InputPulse generator

(Non-overlapping with RESET)

RESET InputPulse generator

(Non-overlapping with SET)

A[REF 1 10] InputSMU (Inverter source - 1V

supply)

B[REF 1 10] InputSMU (Inverter source - 1V

supply)

C[10 15 2050]

InputSMU (Inverter source - 1V

supply)

D[TOP BTM] InputSMU (Inverter source - 1V

supply)

R[1 2] InputSMU (Inverter source - 1V

supply)

E[REF C RC] Output Oscilloscope (high-impedance)

Table 32 PRC1 test pad instrument connections

sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32

343 Experimental results

There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods

When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy

34 electrical characterization 38

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 454

Mean capacitance 709 f F

Standard deviation (σ) 19 f F

Coefficient of variation (3σmicro) 80

Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo

The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability

Die to die (D2D) variability

The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)

On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid

On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691

34 electrical characterization 39

-60 -40 -20 00 20 40 60 80

σ-σ

706

Deviation from mean (fF)

0

005

01

015

02

025

Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 414

Mean capacitance 895 f F

Standard deviation (σ) 22 f F

Coefficient of variation (3σmicro) 74

Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo

of the experimental data within plusmnσ of the mean capacitancevalue

Correlation between TSV capacitance and oxide liner thickness vari-

ation

The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers

Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication

34 electrical characterization 40

σ-σ

691

Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70

0

002

004

006

008

01

012

014

016

018

02

Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function

075

080

085

090

095

100

105

110

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

075

080

085

090

095

100

105

Wafer area

Figure 319 TSV capacitance measurement results on wafer D11

34 electrical characterization 41

086088090092094096098100102104106108

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

086

088

090

092

094

096

098

100

102

104

106

108

Wafer area

Figure 320 TSV capacitance measurement results on wafer D18

tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo

Wafer to wafer (W2W) variability

Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters

34 electrical characterization 42

109

100

095Nor

mal

ized

oxi

de th

ickn

ess

-5-10 -15 -20 -25 -30

510

1520

2530

0Die Y

Die X

Figure 321 Simulated oxide liner thickness variation

60 65 70 75 80 85 90 95 1000

005

01

015

02

025D11 D18

plusmnσ

plusmnσ

Capacitance (fF)

Figure 322 TSV capacitance W2W variability - Probability density func-tions

34 electrical characterization 43

Structure rsquoCrsquo (Fig 311)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 424

Experiment C10 C50

Number of TSVs 4 4

Mean capacitance 4073 f F 4125 f F

Standard deviation (σ) 84 f F 74 f F

Population within σ 695 653

Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo

Variable pitch TSV capacitance

The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm

were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires

As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50

are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability

To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]

Measurement accuracy of CBCM

The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we

3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)

35 summary and conclusions 44

376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320

001

002

003

004

005

006C10 C50

Capacitance (fF)

plusmnσC10

plusmnσC50

695 653

Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function

-80 -60 -40 -20 00 20 40 60 80 1000

002

004

006

008

01

012

014

016

018σ-σ

688

Deviation from mean (fF)

Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function

observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs

connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance

35 summary and conclusions

In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement

35 summary and conclusions 45

Method LCR CBCM

Total dies 472 472

Processed dies (excluding outliers) 393 414

Mean capacitance 820 f F 895 f F

Standard deviation (σ) 23 f F 22 f F

Coefficient of variation (3σmicro) 86 74

Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo

results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]

The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration

The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV

parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM

In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process

4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E

41 introduction

The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV

filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance

Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models

In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]

42 tsv-induced thermomechanical stress

Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or

46

42 tsv-induced thermomechanical stress 47

Material CTE at 20 ˚C (10minus6K)

Si 3

Cu 17

W 45

Table 41 Coefficients of thermal expansion

Figure 41 TSV thermomechanical stress simulation [40]

polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate

Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction

It is well known that stress can affect carrier mobility in MOS-

FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance

Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type

43 test structure implementation 48

Figure 42 TSV thermomechanical stress interaction between two TSVs[40]

proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative

The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections

43 test structure implementation

TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible

43 test structure implementation 49

Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]

The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET

devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types

Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP

die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45

43 test structure implementation 50

A0

B0

C0

D0

E0

F0

G0

H0

I0

J0

A90

B90

E90

F90

G90

J90

DS3 DF3

DF2 DS0

DS2 DF0

DF1 DS1

SS SF

DEC6 VDD

DEC7 GND

DEC8 VG

DEC4 DEC9

DEC5 GND

DEC2 DEC3

DEC0 DEC1

Arraydecoder

Draindecoder

Figure 44 PTP1 test pad mod-ule (layout)

Figure 45 PTP1 test pad mod-ule (micrograph)

43 test structure implementation 51

Metal2Metal1PolyActive

TSV

MOSFETs

07μm007μm 0deg

07μm007μm 90deg

05μm05μm 90deg

05μm05μm 0deg

Figure 46 MOSFET dimensions and rotations

431 MOSFET arrays

Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)

The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs

lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44

The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)

Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution

43 test structure implementation 52

TSV

MOSFETs Substrate taps

Figure 47 1 TSV MOSFET array configuration

MOSFETs Substrate taps

TSV TSV

TSV TSV

Figure 48 4 diagonal TSVs MOSFET array configuration

43 test structure implementation 53

MOSFETs Substrate taps

TSV

TSV TSV

TSV

Figure 49 4 lateral TSVs MOSFET array configuration

MOSFETs Substrate taps

TSV TSV TSV

TSV TSV TSV

TSV TSV TSV

Figure 410 9 TSVs MOSFET array configuration

43 test structure implementation 54

Configuration RotationFET

width(microm)

FETlength(microm)

Name

0 0deg 07 007 A0

1 0deg 07 007 B0

4diagonal 0deg 07 007 C0

4lateral 0deg 07 007 D0

9 0deg 07 007 E0

0 0deg 05 05 F0

1 0deg 05 05 G0

4diagonal 0deg 05 05 H0

4lateral 0deg 05 05 I0

9 0deg 05 05 J0

0 90deg 07 007 A90

1 90deg 07 007 B90

9 90deg 07 007 E90

0 90deg 05 05 F90

1 90deg 05 05 G90

9 90deg 05 05 J90

Table 42 MOSFET array configurations for test pad modulesPTP1PTP2

43 test structure implementation 55

2μm 1μm

Metal2Metal1PolyActive

FET

DUMMY

TAP

TSV

Figure 411 MOSFET array detail

432 Digital selection logic

Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement

The procedure for selecting a single MOSFET device for meas-urement was as follows

bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled

bull MOSFET Gate terminal (G0 to G15) was connected to IO

pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates

bull MOSFET Drain terminal (D0 to D15) was connected to IO

pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder

43 test structure implementation 56

D0

D15

DF[0-3]

DS[0-3]

Drain-select decoder (2-bit)

SSF

SS

Array-select decoder (4-bit)

VG

Gate-select decoder (4-bit)

G0 G15

MOSFET Array

Figure 412 Digital selection logic

and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit

Drain-select decoder was adequate for accessing all 16Drain terminals in the array

When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)

A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating

43 test structure implementation 57

Gatedecoder

Metal2Metal1PolyActive

Transmissiongates (DS)

Transmission gates (G)

MOSFETArray

Figure 413 MOSFET array with digital selection logic

43 test structure implementation 58

BLK

A[0] A[1] A[2] A[3]

DEC[0-15]

Figure 414 4-bit Gate-enableArray-enable decoder

43 test structure implementation 59

DEC

VGGVDD

Transmission gate

Pull-downtransistor

Figure 415 Transmission gate for MOSFET Gate terminals

Transmission gates

FORCE

SENSE

DRAINSOURCE

EN

VDD

Figure 416 Transmission gates for MOSFET DrainSource terminals

43 test structure implementation 60

A[0] A[1]

DEC[0-3]

Figure 417 2-bit Drain-select decoder

44 experimental measurements 61

Configuration Type Name Wafer

1 No-TSVPMOS

(05microm times 05microm)F0 D08

2 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D08

3 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D20

4 1-TSV 0ordm 60 ordmCPMOS

(05microm times 05microm)G0 D08

54-lateral TSV 0ordm 25

ordmCPMOS

(05microm times 05microm)I0 D08

6 1-TSV 90ordm 25 ordmCPMOS

(05microm times 05microm)G90 D08

7 1-TSV 0ordm 25 ordmCPMOS (07microm times

007microm)B0 D08

Table 43 MOSFET array configurations selected for experimental eval-uation

44 experimental measurements

Test modules PTP1 and PTP2 implementing the presented MOS-

FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43

441 Data processing methodology

For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress

Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing

44 experimental measurements 62

ABCDEFBBC

FCDDFBCFFC

FCB

A

BAACADE

CECBBC

CFFCDE

B$BCFFBDBCBF

amp(BFC)B

BC

FCDDFA

CAF

Figure 418 Processing of measurement results

the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (

radic98) The described procedure for

processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not

placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)

442 Measurement setup

For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO

signal generation and measurement was provided by standard

44 experimental measurements 63

ABCDECDBF

ABECBF

AB

AB

Figure 419 Interpolation of measurement results

laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420

The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44

ABCCDE

F

AA

A

DBCB

Figure 420 Measurement setup

44 experimental measurements 64

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

DEC[0-9] InputDigital Output

(GATEDRAINARRAY SelectDecoders Input)

DF[0-3] Input SMU (Drain Force terminal)

DS[0-3] Output SMU (Drain Sense terminal)

SF Input SMU (Source Force terminal)

SS Output SMU (Source Sense terminal)

VG Input Digital Output (Gate voltage)

Table 44 PTP1PTP2 test pad module instrument connections

443 PMOS (long channel)

No-TSV configuration

This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance

As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure

1 TSV configuration

In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo

44 experimental measurements 65

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

096

097

098

099

100

101

Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)

and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements

To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm

The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph

44 experimental measurements 66

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

105

110

115

120TSV

Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)

44 experimental measurements 67

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

13

TSV

Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)

we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV

even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after

increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

09

1

11

12

13

14

Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)

44 experimental measurements 68

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115TSV

Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)

at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)

4 lateral TSV configuration

In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430

respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)

90˚ rotation TSV configuration

In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has

44 experimental measurements 69

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

085

090

095

100

105

110

115

TSV

TSV TSV

TSV

Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)

44 experimental measurements 70

2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309

095

1

105

11

115

12

X16

Distance (μm)

Normalized

Ion

TSV TSV

Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)

3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098

1102

Y14

Distance (μm)

Normalized

Ion

TSV TSV

Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)

44 experimental measurements 71

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115

120TSV

Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)

been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm

444 PMOS (short channel)

In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

08509

0951

10511

11512

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)

44 experimental measurements 72

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

094

096

098

100

102

104

106

108

110

TSV

Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)

measurements for X = 16 and Y = 14 in Figure 434 At at 1microm

distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)

45 summary and conclusions 73

Size

Waf

er

Con

figu

rati

on

Rot

atio

n

Eff

ect

1micro

mla

t

Eff

ect

1micro

mlo

ng

KO

Z

L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm

L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm

L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm

L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm

L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm

S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm

Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices

45 summary and conclusions

In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET

device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors

The structure was implemented in Imecrsquos experimental 65nm

3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations

In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits

Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short

45 summary and conclusions 74

channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress

Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure

1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex

2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy

3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away

4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic

The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated

In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs

5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S

51 introduction

TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2

DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs

increases linearly with the number of tiers and interconnections(Figure 51)

The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm

node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)

In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]

52 energy-recovery tsv interconnects

As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-

75

52 energy-recovery tsv interconnects 76

1XTSV

Tier1

TSV

Tier0LOGIC 0

LOGIC 0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n Tier1

Tier0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n

Tier(m-1)

Tier(m)

mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-

tion density

recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]

EER prop

(

RCL

T

)

CLV2DD (51)

In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]

ECMOS =12

CLV2DD (52)

Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits

EER

ECMOSprop

2RCL

T(53)

From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small

In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC

52 energy-recovery tsv interconnects 77

Tier2Tier1TSV

LOGIC 1

LOGIC 2

driver

TSV

LOGIC 1

LOGIC 2

CMOS

driver

Convert

Energy-recovery

PCG

Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery

can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV

interconnectsThe differences between a typical CMOS driver in a TSV-based

3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form

Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values

52 energy-recovery tsv interconnects 78

IN IN

ININ

OUT OUT

IN

IN

OUT

OUT

Figure 53 Adiabatic driver

Figure 54 A simple pulse-to-level converter (P2LC)

An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded

The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)

For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with

53 analysis 79

VDD2

Rind

RTG

AdiabaticDriver

CL

M1

T=1f

L ind

RTG

CL

Figure 55 Resonant pulse generator with dual-rail data encoding

Tier2

Tier1

f-fPULSE

VDD2CLK IN

Delay1CLK1

CLK0

Delay2CLK2OUT1

Data In

Data Out

CLKRES

TSV TSV TSV

CLKBufferAdiabatic

Driver

P2LC

f-f

Data1

M1

Figure 56 Energy-recovery scheme for TSV interconnects

a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at

f =1

2πradic

LindCL(54)

A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56

53 analysis

The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy

53 analysis 80

losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme

The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be

EER = EAD + Eind + EM1 + EP2L (55)

Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs

architecture used and could be easily extracted from simulationdata as will be shown later in this chapter

531 Adiabatic driver

There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS

dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances

The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle

ETG = 2ξ

(

RTGCL12 T

)

CLV2DD =

π2

2(RTGCL f )CLV2

DD (56)

The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient

53 analysis 81

Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV

capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be

CL = CTSV + Cn + 6CD (57)

The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience

κTG = RTGCn rArr RTG =κTG

Cn(58)

Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2

DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver

EAD = CnV2DD +

π2

2κTG

Cnf [CTSV + Cn + 6CD]

2 V2DD (59)

The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2

2 κTG f to a variable (y = π2

2 κTG f ) Equation 59then becomes

EAD = D middot CnV2DD +

y

Cn[CTSV + (6b + 1)Cn]

2 V2DD

=

[

Cn

(

D

y+ (6b + 1)2

)

+1

CnC2

TSV

]

middot yV2DD

+(12b + 2)CTSV middot yV2DD (510)

Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when

53 analysis 82

they become equal Solving equation Cn

(

Dy + (6b + 1)2

)

= 1Cn

C2TSV

for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV

Cn(opt) =

radic

[D

y+ (6b + 1)2]minus1CTSV (511)

Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver

EAD(min) =

[radic

D

(6b + 1)2y+ 1 + 1

]

middot (12b+ 2)yCTSVV2DD (512)

532 Inductorrsquos parasitic resistance

The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as

Qind =1

Rind

radic

Lind

CLrArr Rind =

1QindCL2π f

(513)

Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513

Eind =π2

2(RindCL f )CLV2

DD =π

4CL

QindV2

DD (514)

533 Switch M1

Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1

Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as

EM1(min) = 2IM1(rms)VGM1

radic

mκM1

f(515)

54 simulations 83

Delay2

CLK1

CLK2Delay1

CLKW

PULSE

CLK0T

PD

OUT1

PULSE

P2L

RES

CLK

Figure 57 Energy-recovery timing constraints

534 Timing Constraints

The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57

The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)

54 simulations

To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the

54 simulations 84

M1IN IN

IN

OUT OUT

IN

Figure 58 Test case for the evaluation of the theoretical model

Q250MHz 500MHz 750MHz

T S e () T S e () T S e ()

6 360 409 -120 476 516 -78 578 613 -57

8 308 347 -114 417 449 -70 515 543 -52

10 276 312 -115 382 411 -70 477 498 -43

12 255 289 -119 358 386 -71 451 472 -44

Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)

adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58

The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51

The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations

In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency

54 simulations 85

5 6 7 8 9 10 11 12 130

10

20

30

40

50

60

70

T250 S250 T500 S500 T750 S750

Q factor

Energydissipation(fF)

Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)

5 6 7 8 9 10 11 12 13

-14

-12

-10

-8

-6

-4

-2

0

250MHz 500MHz 750MHz

Q factor

Error()

Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator

The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation

54 simulations 86

M1

TSV

TSV

TSV

Data in OutCMOS

OutER

Energy-recovery

CMOS

P2LC

Adiabaticdriver

Figure 511 Test case for comparison of energy-recovery to CMOS

541 Comparison to CMOS

When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F

for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is

described by Equation 52 Appending the data switching activity(D)

ECMOS = D middot 12

CLV2DD (516)

Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS

circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison

Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range

54 simulations 87

Figure 512 Improved P2LC design

0 100 200 300 400 500 600 700 800 9000

2

4

6

8

10

12

14

Frequency (MHz)

Energy

dissipation(fJ)

Figure 513 Simulated energy dissipation of improved P2LC design

Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current

The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor

54 simulations 88

4 6 8 10 12 14 16 18 20 22-20

-10

0

10

20

30

40

50

60

800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor

Energyred uction()

Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)

4 6 8 10 12 14 16 18 20 220050

100150200250300350400450

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)

values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors

In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies

In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q

factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total

In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS

circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in

54 simulations 89

4 6 8 10 12 14 16 18 20 2200

100

200

300

400

500

600

700

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)

20 40 60 80 100 120 140 160 180 200 220-100

00

100

200

300

400

500

Q6 Q10 Q14 Q18

TSV capacitance (fF)

Energy

red u

ction(

)

Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)

Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value

Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz

Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm

node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS

54 simulations 90

4 6 8 10 12 14 16 18 20 22-40

-20

0

20

40

60

80

D=05 D=07 D=09 D=1

Q factor

Energyred uction()

Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)

4 6 8 10 12 14 16 18 20 220

10

20

30

40

50

60

130nm 65nm

Q factor

Energyred uction()

Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)

55 summary and conclusions 91

55 summary and conclusions

The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process

The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement

Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme

The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process

6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R

61 introduction

In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance

Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements

62 physical implementation

The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5

The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61

The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz

were calculated using the methodology developed in Chapter 5

and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery

circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)

92

62 physical implementation 93

AB

AB

CDEFD

C

ABCD

EFD

C

Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)

Adiabatic driver 80-bit circuit

Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )

RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)

CL 207 f F (Eq57) Lind 613nH (Eq54)

Table 61 Energy-recovery IO circuit parameters

62 physical implementation 94

IntegratedInductor

VINDOER

PulseGenerator

S1AS2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

Figure 62 Energy-recovery IO circuit schematic diagram

Metal2Metal1PolyActiveVDD

GND

RES_CLK

DATA

OUT

OUT_B

DATA

OUT_B OUT

RES_CLK

GND

VDD

Figure 63 Adiabatic driver (layoutschematic view)

the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs

621 Adiabatic driver

The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63

62 physical implementation 95

Metal2Metal1PolyActive

IN

OUT_B

VDD

IN_B

OUT

GND

OUT_B

IN

GND

VDD

OUT

IN_B

Figure 64 P2LC (layoutschematic view)

622 Pulse-to-level converter

The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals

623 Data generator

A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1

As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE

624 Pulse generator

The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the

62 physical implementation 96

Metal2Metal1PolyActive EN

CLK

VDD

Q

GND

D

CLKGND

ENVDD

QF-F

D

GND

VDDCLK

D Q

Figure 65 Data generator (layoutschematic view)

Delay1

WPULSE

PULSE

CLKTCLK

DATA

CLK_RES

Figure 66 Timing constraints for the energy-recovery demonstratorcircuit

62 physical implementation 97

PULSES1A

S2

GND

VDD

Metal2Metal1PolyActive

VDD

S1

S2

GND

PULSE

Figure 67 Pulse generator (layoutschematic view)

S2ΔPhase

WPULSE

PULSE

S1A

Figure 68 Pulse generator InputOutput signals

gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control

For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68

625 Integrated inductor

The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-

1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)

62 physical implementation 98

100 200 300 400 500 600 700 8005

7

9

11

13

15

17

19

21

10

15

20

25

30

35

40

45

Q (Simulated) Energy Reduction ()

Frequency (MHz)

Qfactor

Energyreduction()

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS

lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)

As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited

For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610

626 Top level design

The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)

For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously

62 physical implementation 99

500μ

m

500μm

Figure 610 Planar square spiral inductor designed in ASITIC

shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance

The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG

80 = 16880 = 21Ω However as

the current flows to lower branch levels in the tree the perceivedresistance increases to RTG

16 RTG4 and on the last one to RTG Any

additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512

Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20

4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit

To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of

62 physical implementation 100

Adiabatic drivers

RTG

16

RTG

80

RTG

4

RTG

Level 1Level 2Level 3Level 4

Figure 611 Resonant pulse distribution tree

Branch level Wire resistance

1st 5 middot 16880 = 105mΩ

2nd 5 middot 16816 = 525mΩ

3rd 5 middot 1684 = 21Ω

4th 5 middot 168Ω = 84Ω

Total load 168+8480 + 21

20 + 05255 + 0105 = 252Ω

Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree

63 post-layout simulation and comparison 101

+-

+-Metal2 Metal1

180μ

m

100μm

Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)

the energy-recovery IO circuit in layout view is illustrated inFigure 613

627 CMOS IO circuit

The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance

The top level layout view of the CMOS IO circuit can be seenin Figure 616

63 post-layout simulation and comparison

The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed

63 post-layout simulation and comparison 102

22mm

12mm

GND GNDVBUF1 VDR

GND

VIND

S1A

S2

EN

OER

Figure 613 Energy-recovery IO circuit in layout view

63 post-layout simulation and comparison 103

OCMOS

S1B DataGenerator

TSV

CMOSDriver

CMOSDriver

TSV Buffer

1

80

EN

Figure 614 CMOS IO circuit schematic diagram

GND

VCMOS

DATA

078u

039u

105u

053u

316u

158u

950u

475u

OUT

Figure 615 CMOS driver schematic

12mm

12mm

OCMOS S1B GND VBUF2 VCMOSGND

GND GNDEN VCMOS

Figure 616 CMOS IO circuit in layout view

63 post-layout simulation and comparison 104

Energy dissipation (80-bits cycle)

Inductor 11pJ (Eq514)

Switch M1 092pJ [ ]

P2LCs 098pJ (Simulated)

Adiabatic drivers 315pJ (Eq512)

Total 615pJ

Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz

After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency

Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme

As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the

63 post-layout simulation and comparison 105

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14RES_CLK

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14DATA

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14OER

Time (ns)

V(V

)

Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit

64 experimental measurements 106

AreaEnergy

dissipation(Theory)

Energydissipation

(Post-layout)

CMOS 144mm2922pJ

(115 f Jbit)

(Eq52)

1265pJ

(158 f Jbit)

Energy recovery 264mm2 615pJ

(77 f Jbit)

768pJ

(96 f Jbit)

Difference +83 minus333 minus393

Table 64 Post-layout simulated energy dissipation Area require-ments

theoretically calculated 333 up to 393 in the post-layoutsimulations

64 experimental measurements

The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively

641 Measurement setup

For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620

The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66

642 Results

CMOS IO circuit

In Figure 621 the input signal S1B and the output signal OCMOS

are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency

64 experimental measurements 107

Figure 618 Energy-recovery IO circuit micrograph

Figure 619 CMOS IO circuit micrograph

AABC

DEFF

D

EEBA

EBBA

Figure 620 Measurement setup

64 experimental measurements 108

Test pad Direction Instrument connection

VIND Power Power supply - gt06V

GND Power Power supply - 0V

S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)

EN InputDigital Output (enable outputdata switching)

OER Output Oscilloscope

VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V

VBUF1 Power Power supply (Buffers) - 12V

Table 65 Energy-recovery circuit instrument connections

Test pad Direction Instrument connection

VCMOS PowerPower supply (CMOS drivers) -12V

GND Power Power supply - 0V

VBUF2 Power Power supply (Buffers) - 12V

S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

OCMOS Output Oscilloscope

EN InputDigital Output (enable outputdata switching)

Table 66 CMOS circuit instrument connections

64 experimental measurements 109

S1B

OCMOS

Figure 621 Oscilloscope output for signals S1B and OCMOS

100 200 300 400 500 600 700 800012345678910

Die1 Die2 Simulation

Frequency (MHz)

Powerdissipation(mW)

Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit

that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected

The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations

Energy-recovery IO circuit

Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER

64 experimental measurements 110

100 200 300 400 500 600 700 800024681012141618

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 623 Measuredsimulated energy dissipation for the CMOScircuit

S1A

OER

Figure 624 Oscilloscope output for signals S1A and OER

shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected

The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)

Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate

65 summary and conclusions 111

200 250 300 350 400 450 500 550 600 65002468101214161820

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit

IntegratedInductor

VINDOER

PulseGenerator

S1A

S2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

200pF

Figure 626 Energy-recovery IO circuit with fault hypothesis

properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER

switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation

65 summary and conclusions

In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them

The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-

65 summary and conclusions 112

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050

055

060

065

070

075

080RES_CLK

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500

02

04

06

08

10

12

14OER

Time (ns)

V(V

)

Figure 627 Simulation of circuit with fault hypothesis

65 summary and conclusions 113

200 250 300 350 400 450 500 550 600 6500

5

10

15

20

25

Die1 Die2 Simulation CL=200pF

Frequency (MHz)

Ene

rgy

diss

ipat

ion

(pJ)

Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis

ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS

circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated

CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

7C O N C L U S I O N S

Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density

As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV

devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs

Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-

FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process

Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process

The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work

114

71 main contributions 115

71 main contributions

From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data

The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations

In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design

72 future work 116

approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances

The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

72 future work

The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV

capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]

In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes

1 The transistors should be in a regular grid in the array forsimple processing of the measurement data

72 future work 117

2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy

3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV

4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues

Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility

In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory

AA P P E N D I X A P H O T O S

Figure A1 3D65 300mm wafer (on chuck)

Figure A2 Probe card

118

appendix a photos 119

Figure A3 3D130 200mm wafer (on chuck)

ProbeHead

Microscope

Figure A4 Probe head

BA P P E N D I X B S TAT I S T I C S

mean (micro)

In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as

micro =1n

n

sumi=1

xi

median (m)

The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean

standard deviation (σ)

Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation

A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability

standard deviation of the mean (σmicro )

The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated

Assuming there is a data set X with n values xi of mean micro

120

appendix b statistics 121

variance(X) = σ2X

variance(micro) = variance( 1n sum

ni=1 xi) =

1n2 variance(sumn

i=1 xi) =1n2 sum

ni=1 variance(xi) =

nn2 variance(X) =1n variance(X) =

σ2Xn rarr

σmicro = σXradicn

coefficient of variation

The coefficient of variation is the ratio of the standard deviationσ to the mean micro

CV =σ

micro

It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage

outlier

Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set

kruskalndash wallis [34]

KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal

B I B L I O G R A P H Y

[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac

currentsourcebroadpurposemn=2602

[2] Max-3d URL httpwwwmicromagiccom

[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet

[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet

[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001

[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI

Design Conf pages 171ndash174 2005 doi 101109ICVD200564

[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf

3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581

[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009

[9] C H Bennett Logical reversibility of computation IBM

Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525

[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038

scientificamerican0785-48

[11] E Beyne 3d system integration technologies In Proc Int

VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113

122

bibliography 123

[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs

22nm-Details_Presentationpdf

[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327

[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534

[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices

Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124

[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942

[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc

IEEE Int Interconnect Technology Conf and 2011 Materials for

Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326

[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf

(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823

[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS

rsquo04 volume 2 2004 doi 101109ISCAS20041329255

[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE

Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260

[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-

bibliography 124

asitics for 3-dimensional integrated circuits In Workshop

Notes Design Automation and Test in Europe (DATE) 2009

[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT

2008 pages 98ndash103 2008 doi 101109IDT20084802475

[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec

beScientificReportSR2009HTML1213307html

[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905

[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE

Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329

[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int

Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311

[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508

[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712

[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii

B9780750677295500446

[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455

bibliography 125

[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591

[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test

Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889

[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on

Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10

1145224081224115

[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical

Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779

[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom

pressroompdfkkuhnKuhn_IWCE_invited_textpdf

[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5

(3)183ndash191 1961 doi 101147rd530183

[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827

[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190

[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf

(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839

[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-

nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079

bibliography 126

[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System

level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm

org101145966747966750

[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453

[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf

PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725

[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE

Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793

[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629

[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp

Exhibition (DATE) pages 1689ndash1694 2010

[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573

[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190

[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi

bibliography 127

A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices

Meeting (IEDM) 2010 doi 101109IEDM20105703278

[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727

[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10

1109JPROC1998658762

[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test

Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103

[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc

56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676

[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443

[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88

(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom

sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on

[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303

bibliography 128

[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium

on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145

10576611057669

[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872

[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004

[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-

sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888

[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156

[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc

IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522

[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501

[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667

[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477

[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-

tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609

bibliography 129

[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology

Conf 2006 doi 101109ECTC20061645678

[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power

Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220

[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc

Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786

[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-

Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338

[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088

[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044

[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int

Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763

[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-

tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http

dxdoiorg101007BF01788562 101007BF01788562

bibliography 130

[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-

electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww

sciencedirectcomsciencearticleB6V44-4829XDJ-2N

2f5b2c221dcf240c38cd13b7b3432f913

[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182

[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541

[78] NHE Weste and DF Harris CMOS VLSI design a cir-

cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775

[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite

printsteve_wozniak_interview

[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-

tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716

[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect

Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710

[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746

[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764

[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915

bibliography 131

[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610

  • Abstract
  • Publications
  • Acknowledgments
  • Contents
  • List of Figures
  • List of Tables
  • Acronyms
  • 1 Introduction
    • 11 Research goals and contribution
    • 12 Thesis outline
      • 2 Background
        • 21 3D integrated circuits
          • 211 3D integration approaches
          • 212 TSV interconnects
            • 22 Energy-recovery logic
              • 221 Energy-recovery architectures
              • 222 Power-clock generators
                • 23 Full-custom design flow
                • 24 Summary
                  • 3 Characterization of TSV capacitance
                    • 31 Introduction
                    • 32 TSV parasitic capacitance
                    • 33 Integrated capacitance measurement techniques
                    • 34 Electrical characterization
                      • 341 Physical implementation
                      • 342 Measurement setup
                      • 343 Experimental results
                        • 35 Summary and conclusions
                          • 4 TSV proximity impact on MOSFET performance
                            • 41 Introduction
                            • 42 TSV-induced thermomechanical stress
                            • 43 Test structure implementation
                              • 431 MOSFET arrays
                              • 432 Digital selection logic
                                • 44 Experimental measurements
                                  • 441 Data processing methodology
                                  • 442 Measurement setup
                                  • 443 PMOS (long channel)
                                  • 444 PMOS (short channel)
                                    • 45 Summary and conclusions
                                      • 5 Evaluation of energy-recovery for TSV interconnects
                                        • 51 Introduction
                                        • 52 Energy-recovery TSV interconnects
                                        • 53 Analysis
                                          • 531 Adiabatic driver
                                          • 532 Inductors parasitic resistance
                                          • 533 Switch M1
                                          • 534 Timing Constraints
                                            • 54 Simulations
                                              • 541 Comparison to CMOS
                                                • 55 Summary and conclusions
                                                  • 6 Energy-recovery 3D-IC demonstrator
                                                    • 61 Introduction
                                                    • 62 Physical implementation
                                                      • 621 Adiabatic driver
                                                      • 622 Pulse-to-level converter
                                                      • 623 Data generator
                                                      • 624 Pulse generator
                                                      • 625 Integrated inductor
                                                      • 626 Top level design
                                                      • 627 CMOS IO circuit
                                                        • 63 Post-layout simulation and comparison
                                                        • 64 Experimental measurements
                                                          • 641 Measurement setup
                                                          • 642 Results
                                                            • 65 Summary and conclusions
                                                              • 7 Conclusions
                                                                • 71 Main contributions
                                                                • 72 Future work
                                                                  • A Appendix A Photos
                                                                  • B Appendix B Statistics
                                                                  • Bibliography
Page 9: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate

contents vii

55 Summary and conclusions 91

6 Energy-recovery 3D-IC demonstrator 92

61 Introduction 92

62 Physical implementation 92

621 Adiabatic driver 94

622 Pulse-to-level converter 95

623 Data generator 95

624 Pulse generator 95

625 Integrated inductor 97

626 Top level design 98

627 CMOS IO circuit 101

63 Post-layout simulation and comparison 101

64 Experimental measurements 106

641 Measurement setup 106

642 Results 106

65 Summary and conclusions 111

7 Conclusions 114

71 Main contributions 115

72 Future work 116

a Appendix A Photos 118

b Appendix B Statistics 120

bibliography 122

L I S T O F F I G U R E S

Figure 11 Scaling trend of logic and interconnect delay[30] 2

Figure 21 Comparison of 2D3D Integration 5

Figure 22 System-in-a-package 6

Figure 23 Package-on-package 6

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7

Figure 25 3D wafer-level-packaging 7

Figure 26 3D-stacked IC 8

Figure 27 TSV process steps 9

Figure 28 Before thinningafter thinning 10

Figure 29 TSV stacking 10

Figure 210 Conventional AND gate 14

Figure 211 CMOS switching 16

Figure 212 Adiabatic switching 16

Figure 213 Basic 2N-2P adiabatic circuit 17

Figure 214 2N-2P timing 17

Figure 215 Adiabatic driver 18

Figure 216 Stepwise charging generator 19

Figure 217 1N power clock generator 20

Figure 218 Design flow 22

Figure 219 Inverter in Virtuoso schematic 22

Figure 220 Inverter in Virtuoso layout 22

Figure 31 Isolated 2D wire capacitance 24

Figure 32 TSV parasitic capacitance 25

Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25

Figure 34 CBCM pseudo-inverters 27

Figure 35 Current as a function of frequency in CBCM 28

Figure 36 CBCM pseudo-inverter pair in layout 29

Figure 37 PRC1 test pad module (layout) 30

Figure 38 PRC1 test pad module (micrograph) 30

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34

Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35

viii

List of Figures ix

Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35

Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36

Figure 316 Measurement setup 36

Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39

Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40

Figure 319 TSV capacitance measurement results onwafer D11 40

Figure 320 TSV capacitance measurement results onwafer D18 41

Figure 321 Simulated oxide liner thickness variation 42

Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42

Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44

Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44

Figure 41 TSV thermomechanical stress simulation [40] 47

Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48

Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49

Figure 44 PTP1 test pad module (layout) 50

Figure 45 PTP1 test pad module (micrograph) 50

Figure 46 MOSFET dimensions and rotations 51

Figure 47 1 TSV MOSFET array configuration 52

Figure 48 4 diagonal TSVs MOSFET array configuration52

Figure 49 4 lateral TSVs MOSFET array configuration 53

Figure 410 9 TSVs MOSFET array configuration 53

Figure 411 MOSFET array detail 55

Figure 412 Digital selection logic 56

Figure 413 MOSFET array with digital selection logic 57

Figure 414 4-bit Gate-enableArray-enable decoder 58

Figure 415 Transmission gate for MOSFET Gate terminals59

Figure 416 Transmission gates for MOSFET Drain-Source terminals 59

Figure 417 2-bit Drain-select decoder 60

Figure 418 Processing of measurement results 62

Figure 419 Interpolation of measurement results 63

Figure 420 Measurement setup 63

Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65

List of Figures x

Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66

Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66

Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67

Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67

Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68

Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69

Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69

Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70

Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70

Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71

Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71

Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72

Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72

Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76

Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77

Figure 53 Adiabatic driver 78

Figure 54 A simple pulse-to-level converter (P2LC) 78

Figure 55 Resonant pulse generator with dual-rail dataencoding 79

Figure 56 Energy-recovery scheme for TSV interconnects79

Figure 57 Energy-recovery timing constraints 83

Figure 58 Test case for the evaluation of the theoret-ical model 84

Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85

Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85

Figure 511 Test case for comparison of energy-recoveryto CMOS 86

Figure 512 Improved P2LC design 87

List of Figures xi

Figure 513 Simulated energy dissipation of improvedP2LC design 87

Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88

Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88

Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89

Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =

80 f F f = 500MHz) 89

Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90

Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90

Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93

Figure 62 Energy-recovery IO circuit schematic dia-gram 94

Figure 63 Adiabatic driver (layoutschematic view) 94

Figure 64 P2LC (layoutschematic view) 95

Figure 65 Data generator (layoutschematic view) 96

Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96

Figure 67 Pulse generator (layoutschematic view) 97

Figure 68 Pulse generator InputOutput signals 97

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98

Figure 610 Planar square spiral inductor designed inASITIC 99

Figure 611 Resonant pulse distribution tree 100

Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101

Figure 613 Energy-recovery IO circuit in layout view 102

Figure 614 CMOS IO circuit schematic diagram 103

Figure 615 CMOS driver schematic 103

Figure 616 CMOS IO circuit in layout view 103

Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105

Figure 618 Energy-recovery IO circuit micrograph 107

List of Figures xii

Figure 619 CMOS IO circuit micrograph 107

Figure 620 Measurement setup 107

Figure 621 Oscilloscope output for signals S1B and OC-MOS 109

Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109

Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110

Figure 624 Oscilloscope output for signals S1A and OER110

Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111

Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111

Figure 627 Simulation of circuit with fault hypothesis 112

Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113

Figure A1 3D65 300mm wafer (on chuck) 118

Figure A2 Probe card 118

Figure A3 3D130 200mm wafer (on chuck) 119

Figure A4 Probe head 119

L I S T O F TA B L E S

Table 21 3D interconnect technologies comparison 8

Table 22 Technology parameters for Imec 3D130 process11

Table 23 Technology parameters for Imec 3D65 process11

Table 24 Comparison energy-recoveryconventionalCMOS 16

Table 31 Test structures on the PRC1 module 31

Table 32 PRC1 test pad instrument connections 37

Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38

Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39

Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43

Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45

Table 41 Coefficients of thermal expansion 47

Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54

Table 43 MOSFET array configurations selected forexperimental evaluation 61

Table 44 PTP1PTP2 test pad module instrumentconnections 64

Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73

Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84

Table 61 Energy-recovery IO circuit parameters 93

Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100

Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104

Table 64 Post-layout simulated energy dissipation Area requirements 106

Table 65 Energy-recovery circuit instrument connec-tions 108

Table 66 CMOS circuit instrument connections 108

xiii

A C R O N Y M S

2D two-dimentional

2N-2P 2 nMOS2 pMOS adiabatic logic

3D-IC three-dimentional integrated circuit

3D-SIC 3D stacked-IC

3D-SIP 3D system-in-package

3D three-dimentional

3D-WLP 3D wafer-level-packaging

BCB benzocyclobutene

BEOL back-end-of-line

CAL clocked adiabatic logic

CBCM charge-based capacitance measurement

CMOS complementary metal-oxide-semiconductor

CMP chemical-mechanical planarization

CTE coefficient of thermal expansion

Cu copper

CVD chemical vapor deposition

D2D die-to-die

D2W die-to-wafer

DCVSL differential cascode voltage switch logic

DRC design rule check

DUT device under test

ECRL efficient charge recovery logic

EDA electronic design automation

FEM finite element method

FEOL front-end-of-line

GPIB general purpose interface bus

IC integrated circuit

xiv

acronyms xv

IO inputoutput

KOZ keep-out-zone

LC inductance-capacitance

LVS layout versus schematic

MEMS microelectromechanical systems

MIM metal-insulator-metal

MOSFET metal-oxide-semiconductor field-effect transistor

MOS metal-oxide-semiconductor

MPU microprocessor unit

MuGFET multiple gate field-effect transistor

NMOS n-channel MOSFET

P2LC pulse-to-level converter

PAL pass-transistor adiabatic logic

PCB printed circuit board

PCG power-clock generator

PMD pre-metal dielectric

PMOS p-channel MOSFET

PVT process voltage temperature

QSERL quasi-static energy recovery logic

RC resistance-capacitance

RDL re-distribution layer

RLC resistance-inductance-capacitance

RMS root mean square

SIP system-in-a-package

Si silicon

SMU source measurement unit

SoC system-on-chip

TaN tantalum nitride

TSV through-silicon via

W tungsten

acronyms xvi

W2W wafer-to-wafer

WLP wafer-level-packaging

1I N T R O D U C T I O N

Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965

chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]

As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects

In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)

The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and

1 International Technology Roadmap for Semiconductors

1

11 research goals and contribution 2

Figure 11 Scaling trend of logic and interconnect delay [30]

innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]

Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D

interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density

11 research goals and contribution

As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-

ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]

The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future

12 thesis outline 3

3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection

The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution

The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs

are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process

The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements

12 thesis outline

This thesis is organised into seven Chapters and two Appendixesas follows

2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)

12 thesis outline 4

Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described

Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods

Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process

Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design

Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements

Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work

Appendix A includes photos used as reference in various partsof this work

Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data

2B A C K G R O U N D

The work presented in this thesis contributes in the fields of 3D

integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters

21 3d integrated circuits

CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package

There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy

Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of

ABCDEFBA ABCDEFBA

Figure 21 Comparison of 2D3D Integration

5

21 3d integrated circuits 6

ABCDEFBC

E

Figure 22 System-in-a-package

ABCDEFBC

E

Figure 23 Package-on-package

a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities

The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor

211 3D integration approaches

3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure

bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages

bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)

bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion

The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)

3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps

21 3d integrated circuits 7

Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]

A

BCDEAFE

FFDC

CFE

Figure 25 3D wafer-level-packaging

take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]

3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]

The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-

SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost

A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21

21 3d integrated circuits 8

ABCD

EFFDB

F

Figure 26 3D-stacked IC

3D interconnect

technologyAdvantages Disadvantages

3D system-in-package (3D-SIP)

Simple existingtechnology bestheterogeneous

integration

Performance(speed power) low

densitymanufacturing costin high-quantities

3D wafer-level-packaging (3D-

WLP)

Existingtechnology good

compromisebetween

performance-manufacturing

cost

Averageperformance and

density

3D stacked-IC (3D-SIC)

Performance(speed power)

high densitymanufacturing costin high-quantities

Manufacturing costin low-quantities

not ready formanufacturing

Table 21 3D interconnect technologies comparison

21 3d integrated circuits 9

ABBCDE

FBACACDA

ACCD F

Figure 27 TSV process steps

212 TSV interconnects

Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection

In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie

Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively

As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]

21 3d integrated circuits 10

AAAB

CAAAB

ABC

DEF

Figure 28 Before thinningafter thinning

ABC ABC

DE DE

F

DEE

B

CBA

Figure 29 TSV stacking

21 3d integrated circuits 11

Minimum devicechannel length (nm)

130

Supply voltage (V) 12

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 22

TSV resistance (mΩ) ~20

TSV capacitance( f F)

~40

Wafer diameter(mm)

200

Table 22 Technology paramet-ers for Imec 3D130

process

Minimum devicechannel length (nm)

70

Supply voltage (V) 10

Routing layers 2

TSV diameter (microm) 5

TSV pitch (microm) 10

TSV length (microm) 40

TSV resistance (mΩ) ~40

TSV capacitance( f F)

~80

Wafer diameter(mm)

300

Table 23 Technology paramet-ers for Imec 3D65 pro-cess

Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing

Design tools

Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs

Thermomechanical stress

TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur

21 3d integrated circuits 12

The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance

Thermal analysis

The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink

Electrical characteristics

The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]

Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV

structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process

22 energy-recovery logic 13

Testing

Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems

22 energy-recovery logic

TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology

1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections

2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them

Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that

22 energy-recovery logic 14

Figure 210 Conventional AND gate

were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6

Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded

Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output

Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to

22 energy-recovery logic 15

perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]

In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]

The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =

CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]

E = QVDD = CV2DD (21)

From the total transferred energy only half ( 12 CV2

DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C

The adiabatic charging of an equivalent node capacitance C

is modelled in Figure 212 In contrast to the conventional CMOS

circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to

EAdiabatic prop

(

RC

T

)

CV2DD (22)

The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters

22 energy-recovery logic 16

A

B

CC

D

E

D

Figure 211 CMOS switching

AA

B

C

B

Figure 212 Adiabatic switching

Conventional CMOS Energy-recovery CMOS

2 states True False 3 states True False Off

Energy of information bitdissipated as heat on the

pull-down network

Energy of information bitrecovered by the

power-supply

High speed Lowmedium speed

High energy dissipation Very low energy dissipation

Area efficient Some area overhead

Table 24 Comparison energy-recoveryconventional CMOS

Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]

A comparison between energy-recovery and conventional CMOSis summarised in Table 24

22 energy-recovery logic 17

ABCABC

Figure 213 Basic 2N-2P adiabaticcircuit

ABC

ABC

DEF

EE

DEF

Figure 214 2N-2P timing

221 Energy-recovery architectures

Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency

A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P

circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern

The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel

22 energy-recovery logic 18

Figure 215 Adiabatic driver

MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network

The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]

The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle

Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]

222 Power-clock generators

Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and

22 energy-recovery logic 19

A

A

B

C

C

Figure 216 Stepwise charging generator

recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators

Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow

Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at

f =1

2πradic

LCLOAD

(23)

As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is

23 full-custom design flow 20

A

BCD

E

DFF

CD

Figure 217 1N power clock generator

partially recovered and stored in the inductor for later use RLC

oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip

23 full-custom design flow

All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal

In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available

The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case

1 Cadence Design Systems Inc (httpwwwcadencecom)

24 summary 21

the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step

Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics

24 summary

In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process

2 Mentor Graphics Inc (httpwwwmentorcom)

24 summary 22

ABCDEA

FCCAAE

ABCDEAEC

DE

E

F

CEA

EEAEA

EDE

CAC

E

EC

CAC

Figure 218 Design flow

Figure 219 Inverter in Virtuososchematic

Figure 220 Inverter in Virtuosolayout

3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E

31 introduction

In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology

In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]

From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system

Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated

32 tsv parasitic capacitance

A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric

1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments

23

33 integrated capacitance measurement techniques 24

ABC

Figure 31 Isolated 2D wire capacitance

material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is

C =εox

hwl (31)

In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material

In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)

The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)

33 integrated capacitance measurement techniques

Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV

33 integrated capacitance measurement techniques 25

ABCDE

F

FD

F

F

FD

Figure 32 TSV parasitic capacitance

ABCDE FACDE

ECDE

BCFACDE

FCCDED DBCDE

D

DAB

Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor

33 integrated capacitance measurement techniques 26

characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories

1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]

2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]

The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer

On-chip techniques can achieve better accuracy when the DUT

capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]

In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows

CDUT =Im minus Ire f

VDD middot f(32)

The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of

34 electrical characterization 27

AB

AC

DE

FE

DE

FE

AB

AC

B

Figure 34 CBCM pseudo-inverters

Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot

The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET

capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz

should be adequate to avoid the MOS behaviour of the TSV

34 electrical characterization

A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented

34 electrical characterization 28

AB

CADE

F

AB

E

CAD

B

Figure 35 Current as a function of frequency in CBCM

on the same chip that went beyond the core objectives of thiswork

The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed

341 Physical implementation

In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result

Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics

34 electrical characterization 29

Poly dummy fills

Metal2Metal1PolysiliconActive

1μm 5μm

Figure 36 CBCM pseudo-inverter pair in layout

Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38

The CBCM pseudo-inverters were implemented on either TOP

or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below

Structure rsquoArsquo Single TSV capacitance (TOP die)

The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to

34 electrical characterization 30

DTOP DBTM

B10 A10

BREF AREF

B1 A1

R1 R2

SET VDD

RESET GND

C15 C10

C50 C20

SET VDD

ERC SET

EREF EC Metal2 (TOP)Metal1 (TOP)

Metal1 (BTM)Metal2 (BTM)

Figure 37 PRC1 test pad module(layout)

Figure 38 PRC1 test pad module(micrograph)

34 electrical characterization 31

Structure A B

TSVs - 1 2x5 - 1 2x5

Inverterlocation

TOP TOP TOP BTM BTM BTM

TSVconnection

TOP TOP TOP BTM BTM BTM

Multi-TSVconnection

- - TOP - - BTM

Multi-TSVpitch (microm)

- - 15 - - 15

Pad name AREF A1 A10 BREF B1 B10

Structure C D

TSVs 2x2 2x2 2x2 2x2 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP TOP

TSVconnection

TOP TOP TOP TOP TOP TOP

Multi-TSVconnection

TOP TOP TOP TOP TOP BTM

Multi-TSVpitch (microm)

10 15 20 50 20 20

Pad name C10 C15 C20 C50 DTOP DBTM

Structure R E

TSVs 1 1 - 1x2 1x2

Inverterlocation

TOP TOP TOP TOP TOP

TSVconnection

TOP TOP - TOP TOP

Multi-TSVconnection

- - - TOP BTM

Multi-TSVpitch (microm)

- - - 20 20

Pad name R1 R2 EREF EC ERC

Table 31 Test structures on the PRC1 module

34 electrical characterization 32

Pseudo-inverters

2x5 TSVs

TSV

Reference

Metal2 (TOP)Metal1 (TOP)

Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)

the single TSV DUT the structure also contains 10 parallel TSVs

arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude

Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)

The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side

Structure rsquoCrsquo Variable pitch TSV capacitance

The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the

34 electrical characterization 33

2x5 TSVs

Reference

TSV

Pseudo-inverters

Metal2 (BTM)Metal1 (BTM)

Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)

literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process

The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative

Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-

TOM die

The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking

34 electrical characterization 34

2x2 TSVs - 10μm

2x2 TSVs - 15μm

2x2 TSVs - 20μm

2x2 TSVs - 50μm

Metal2 (TOP)Metal1 (TOP)

Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die

34 electrical characterization 35

Metal2 (TOP)Metal1 (TOP)

Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy

Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)

Driver

Reference

Driver

Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)

Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy

Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability

Structure rsquoErsquo TSV capacitance effect on signal propagation delay

Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult

34 electrical characterization 36

ABC

D

ECBFC

ABC

Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)

ABC

DEF

CD

A

CA

A

Figure 316 Measurement setup

342 Measurement setup

The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316

The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-

2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup

34 electrical characterization 37

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

SET InputPulse generator

(Non-overlapping with RESET)

RESET InputPulse generator

(Non-overlapping with SET)

A[REF 1 10] InputSMU (Inverter source - 1V

supply)

B[REF 1 10] InputSMU (Inverter source - 1V

supply)

C[10 15 2050]

InputSMU (Inverter source - 1V

supply)

D[TOP BTM] InputSMU (Inverter source - 1V

supply)

R[1 2] InputSMU (Inverter source - 1V

supply)

E[REF C RC] Output Oscilloscope (high-impedance)

Table 32 PRC1 test pad instrument connections

sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32

343 Experimental results

There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods

When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy

34 electrical characterization 38

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 454

Mean capacitance 709 f F

Standard deviation (σ) 19 f F

Coefficient of variation (3σmicro) 80

Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo

The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability

Die to die (D2D) variability

The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)

On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid

On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691

34 electrical characterization 39

-60 -40 -20 00 20 40 60 80

σ-σ

706

Deviation from mean (fF)

0

005

01

015

02

025

Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function

Structure rsquoArsquo (Fig 39)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 414

Mean capacitance 895 f F

Standard deviation (σ) 22 f F

Coefficient of variation (3σmicro) 74

Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo

of the experimental data within plusmnσ of the mean capacitancevalue

Correlation between TSV capacitance and oxide liner thickness vari-

ation

The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers

Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication

34 electrical characterization 40

σ-σ

691

Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70

0

002

004

006

008

01

012

014

016

018

02

Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function

075

080

085

090

095

100

105

110

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

075

080

085

090

095

100

105

Wafer area

Figure 319 TSV capacitance measurement results on wafer D11

34 electrical characterization 41

086088090092094096098100102104106108

0

5

10

15

20

25

30

-30-25

-20-15

-10-5

Norm

aliz

ed

capacitance

Die X

Die Y

086

088

090

092

094

096

098

100

102

104

106

108

Wafer area

Figure 320 TSV capacitance measurement results on wafer D18

tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo

Wafer to wafer (W2W) variability

Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters

34 electrical characterization 42

109

100

095Nor

mal

ized

oxi

de th

ickn

ess

-5-10 -15 -20 -25 -30

510

1520

2530

0Die Y

Die X

Figure 321 Simulated oxide liner thickness variation

60 65 70 75 80 85 90 95 1000

005

01

015

02

025D11 D18

plusmnσ

plusmnσ

Capacitance (fF)

Figure 322 TSV capacitance W2W variability - Probability density func-tions

34 electrical characterization 43

Structure rsquoCrsquo (Fig 311)

TSV dimensions (ld) 40microm5microm

Oxide liner thickness 200nm

Total dies 472

Processed dies (excluding outliers) 424

Experiment C10 C50

Number of TSVs 4 4

Mean capacitance 4073 f F 4125 f F

Standard deviation (σ) 84 f F 74 f F

Population within σ 695 653

Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo

Variable pitch TSV capacitance

The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm

were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires

As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50

are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability

To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]

Measurement accuracy of CBCM

The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we

3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)

35 summary and conclusions 44

376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320

001

002

003

004

005

006C10 C50

Capacitance (fF)

plusmnσC10

plusmnσC50

695 653

Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function

-80 -60 -40 -20 00 20 40 60 80 1000

002

004

006

008

01

012

014

016

018σ-σ

688

Deviation from mean (fF)

Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function

observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs

connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance

35 summary and conclusions

In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement

35 summary and conclusions 45

Method LCR CBCM

Total dies 472 472

Processed dies (excluding outliers) 393 414

Mean capacitance 820 f F 895 f F

Standard deviation (σ) 23 f F 22 f F

Coefficient of variation (3σmicro) 86 74

Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo

results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]

The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration

The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV

parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM

In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process

4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E

41 introduction

The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV

filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance

Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models

In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]

42 tsv-induced thermomechanical stress

Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or

46

42 tsv-induced thermomechanical stress 47

Material CTE at 20 ˚C (10minus6K)

Si 3

Cu 17

W 45

Table 41 Coefficients of thermal expansion

Figure 41 TSV thermomechanical stress simulation [40]

polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate

Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction

It is well known that stress can affect carrier mobility in MOS-

FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance

Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type

43 test structure implementation 48

Figure 42 TSV thermomechanical stress interaction between two TSVs[40]

proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative

The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections

43 test structure implementation

TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible

43 test structure implementation 49

Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]

The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET

devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types

Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP

die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45

43 test structure implementation 50

A0

B0

C0

D0

E0

F0

G0

H0

I0

J0

A90

B90

E90

F90

G90

J90

DS3 DF3

DF2 DS0

DS2 DF0

DF1 DS1

SS SF

DEC6 VDD

DEC7 GND

DEC8 VG

DEC4 DEC9

DEC5 GND

DEC2 DEC3

DEC0 DEC1

Arraydecoder

Draindecoder

Figure 44 PTP1 test pad mod-ule (layout)

Figure 45 PTP1 test pad mod-ule (micrograph)

43 test structure implementation 51

Metal2Metal1PolyActive

TSV

MOSFETs

07μm007μm 0deg

07μm007μm 90deg

05μm05μm 90deg

05μm05μm 0deg

Figure 46 MOSFET dimensions and rotations

431 MOSFET arrays

Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)

The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs

lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44

The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)

Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution

43 test structure implementation 52

TSV

MOSFETs Substrate taps

Figure 47 1 TSV MOSFET array configuration

MOSFETs Substrate taps

TSV TSV

TSV TSV

Figure 48 4 diagonal TSVs MOSFET array configuration

43 test structure implementation 53

MOSFETs Substrate taps

TSV

TSV TSV

TSV

Figure 49 4 lateral TSVs MOSFET array configuration

MOSFETs Substrate taps

TSV TSV TSV

TSV TSV TSV

TSV TSV TSV

Figure 410 9 TSVs MOSFET array configuration

43 test structure implementation 54

Configuration RotationFET

width(microm)

FETlength(microm)

Name

0 0deg 07 007 A0

1 0deg 07 007 B0

4diagonal 0deg 07 007 C0

4lateral 0deg 07 007 D0

9 0deg 07 007 E0

0 0deg 05 05 F0

1 0deg 05 05 G0

4diagonal 0deg 05 05 H0

4lateral 0deg 05 05 I0

9 0deg 05 05 J0

0 90deg 07 007 A90

1 90deg 07 007 B90

9 90deg 07 007 E90

0 90deg 05 05 F90

1 90deg 05 05 G90

9 90deg 05 05 J90

Table 42 MOSFET array configurations for test pad modulesPTP1PTP2

43 test structure implementation 55

2μm 1μm

Metal2Metal1PolyActive

FET

DUMMY

TAP

TSV

Figure 411 MOSFET array detail

432 Digital selection logic

Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement

The procedure for selecting a single MOSFET device for meas-urement was as follows

bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled

bull MOSFET Gate terminal (G0 to G15) was connected to IO

pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates

bull MOSFET Drain terminal (D0 to D15) was connected to IO

pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder

43 test structure implementation 56

D0

D15

DF[0-3]

DS[0-3]

Drain-select decoder (2-bit)

SSF

SS

Array-select decoder (4-bit)

VG

Gate-select decoder (4-bit)

G0 G15

MOSFET Array

Figure 412 Digital selection logic

and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit

Drain-select decoder was adequate for accessing all 16Drain terminals in the array

When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)

A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating

43 test structure implementation 57

Gatedecoder

Metal2Metal1PolyActive

Transmissiongates (DS)

Transmission gates (G)

MOSFETArray

Figure 413 MOSFET array with digital selection logic

43 test structure implementation 58

BLK

A[0] A[1] A[2] A[3]

DEC[0-15]

Figure 414 4-bit Gate-enableArray-enable decoder

43 test structure implementation 59

DEC

VGGVDD

Transmission gate

Pull-downtransistor

Figure 415 Transmission gate for MOSFET Gate terminals

Transmission gates

FORCE

SENSE

DRAINSOURCE

EN

VDD

Figure 416 Transmission gates for MOSFET DrainSource terminals

43 test structure implementation 60

A[0] A[1]

DEC[0-3]

Figure 417 2-bit Drain-select decoder

44 experimental measurements 61

Configuration Type Name Wafer

1 No-TSVPMOS

(05microm times 05microm)F0 D08

2 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D08

3 1-TSV 0ordm 25 ordmCPMOS

(05microm times 05microm)G0 D20

4 1-TSV 0ordm 60 ordmCPMOS

(05microm times 05microm)G0 D08

54-lateral TSV 0ordm 25

ordmCPMOS

(05microm times 05microm)I0 D08

6 1-TSV 90ordm 25 ordmCPMOS

(05microm times 05microm)G90 D08

7 1-TSV 0ordm 25 ordmCPMOS (07microm times

007microm)B0 D08

Table 43 MOSFET array configurations selected for experimental eval-uation

44 experimental measurements

Test modules PTP1 and PTP2 implementing the presented MOS-

FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43

441 Data processing methodology

For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress

Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing

44 experimental measurements 62

ABCDEFBBC

FCDDFBCFFC

FCB

A

BAACADE

CECBBC

CFFCDE

B$BCFFBDBCBF

amp(BFC)B

BC

FCDDFA

CAF

Figure 418 Processing of measurement results

the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (

radic98) The described procedure for

processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not

placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)

442 Measurement setup

For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO

signal generation and measurement was provided by standard

44 experimental measurements 63

ABCDECDBF

ABECBF

AB

AB

Figure 419 Interpolation of measurement results

laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420

The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44

ABCCDE

F

AA

A

DBCB

Figure 420 Measurement setup

44 experimental measurements 64

Test pad Direction Instrument connection

VDD Power Power supply - 1V

GND Power Power supply - 0V

DEC[0-9] InputDigital Output

(GATEDRAINARRAY SelectDecoders Input)

DF[0-3] Input SMU (Drain Force terminal)

DS[0-3] Output SMU (Drain Sense terminal)

SF Input SMU (Source Force terminal)

SS Output SMU (Source Sense terminal)

VG Input Digital Output (Gate voltage)

Table 44 PTP1PTP2 test pad module instrument connections

443 PMOS (long channel)

No-TSV configuration

This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance

As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure

1 TSV configuration

In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo

44 experimental measurements 65

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

096

097

098

099

100

101

Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)

and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements

To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm

The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph

44 experimental measurements 66

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

105

110

115

120TSV

Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)

44 experimental measurements 67

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

09

10

11

12

13

TSV

Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)

we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV

even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after

increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

09

1

11

12

13

14

Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)

44 experimental measurements 68

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115TSV

Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)

at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)

4 lateral TSV configuration

In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430

respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)

90˚ rotation TSV configuration

In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has

44 experimental measurements 69

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115

12Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

085

090

095

100

105

110

115

TSV

TSV TSV

TSV

Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)

44 experimental measurements 70

2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309

095

1

105

11

115

12

X16

Distance (μm)

Normalized

Ion

TSV TSV

Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)

3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098

1102

Y14

Distance (μm)

Normalized

Ion

TSV TSV

Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)

44 experimental measurements 71

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

090

095

100

105

110

115

120TSV

Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)

been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm

444 PMOS (short channel)

In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

08509

0951

10511

11512

125Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)

44 experimental measurements 72

X

0 5 10 15 20 25 30

Y

0

5

10

15

20

25

30

094

096

098

100

102

104

106

108

110

TSV

Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)

measurements for X = 16 and Y = 14 in Figure 434 At at 1microm

distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm

14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308

085

09

095

1

105

11

115Y14X16

Distance (μm)

Nor

mal

ized

Ion

TSV

Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)

45 summary and conclusions 73

Size

Waf

er

Con

figu

rati

on

Rot

atio

n

Eff

ect

1micro

mla

t

Eff

ect

1micro

mlo

ng

KO

Z

L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm

L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm

L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm

L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm

L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm

S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm

Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices

45 summary and conclusions

In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET

device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors

The structure was implemented in Imecrsquos experimental 65nm

3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations

In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits

Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short

45 summary and conclusions 74

channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress

Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure

1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex

2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy

3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away

4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic

The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated

In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs

5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S

51 introduction

TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2

DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs

increases linearly with the number of tiers and interconnections(Figure 51)

The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm

node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)

In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]

52 energy-recovery tsv interconnects

As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-

75

52 energy-recovery tsv interconnects 76

1XTSV

Tier1

TSV

Tier0LOGIC 0

LOGIC 0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n Tier1

Tier0

TSV

LOGIC 0

LOGIC 0

TSV

LOGIC n

LOGIC n

Tier(m-1)

Tier(m)

mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-

tion density

recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]

EER prop

(

RCL

T

)

CLV2DD (51)

In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]

ECMOS =12

CLV2DD (52)

Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits

EER

ECMOSprop

2RCL

T(53)

From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small

In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC

52 energy-recovery tsv interconnects 77

Tier2Tier1TSV

LOGIC 1

LOGIC 2

driver

TSV

LOGIC 1

LOGIC 2

CMOS

driver

Convert

Energy-recovery

PCG

Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery

can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV

interconnectsThe differences between a typical CMOS driver in a TSV-based

3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form

Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values

52 energy-recovery tsv interconnects 78

IN IN

ININ

OUT OUT

IN

IN

OUT

OUT

Figure 53 Adiabatic driver

Figure 54 A simple pulse-to-level converter (P2LC)

An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded

The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)

For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with

53 analysis 79

VDD2

Rind

RTG

AdiabaticDriver

CL

M1

T=1f

L ind

RTG

CL

Figure 55 Resonant pulse generator with dual-rail data encoding

Tier2

Tier1

f-fPULSE

VDD2CLK IN

Delay1CLK1

CLK0

Delay2CLK2OUT1

Data In

Data Out

CLKRES

TSV TSV TSV

CLKBufferAdiabatic

Driver

P2LC

f-f

Data1

M1

Figure 56 Energy-recovery scheme for TSV interconnects

a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at

f =1

2πradic

LindCL(54)

A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56

53 analysis

The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy

53 analysis 80

losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme

The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be

EER = EAD + Eind + EM1 + EP2L (55)

Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs

architecture used and could be easily extracted from simulationdata as will be shown later in this chapter

531 Adiabatic driver

There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS

dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances

The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle

ETG = 2ξ

(

RTGCL12 T

)

CLV2DD =

π2

2(RTGCL f )CLV2

DD (56)

The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient

53 analysis 81

Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV

capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be

CL = CTSV + Cn + 6CD (57)

The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience

κTG = RTGCn rArr RTG =κTG

Cn(58)

Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2

DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver

EAD = CnV2DD +

π2

2κTG

Cnf [CTSV + Cn + 6CD]

2 V2DD (59)

The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2

2 κTG f to a variable (y = π2

2 κTG f ) Equation 59then becomes

EAD = D middot CnV2DD +

y

Cn[CTSV + (6b + 1)Cn]

2 V2DD

=

[

Cn

(

D

y+ (6b + 1)2

)

+1

CnC2

TSV

]

middot yV2DD

+(12b + 2)CTSV middot yV2DD (510)

Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when

53 analysis 82

they become equal Solving equation Cn

(

Dy + (6b + 1)2

)

= 1Cn

C2TSV

for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV

Cn(opt) =

radic

[D

y+ (6b + 1)2]minus1CTSV (511)

Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver

EAD(min) =

[radic

D

(6b + 1)2y+ 1 + 1

]

middot (12b+ 2)yCTSVV2DD (512)

532 Inductorrsquos parasitic resistance

The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as

Qind =1

Rind

radic

Lind

CLrArr Rind =

1QindCL2π f

(513)

Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513

Eind =π2

2(RindCL f )CLV2

DD =π

4CL

QindV2

DD (514)

533 Switch M1

Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1

Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as

EM1(min) = 2IM1(rms)VGM1

radic

mκM1

f(515)

54 simulations 83

Delay2

CLK1

CLK2Delay1

CLKW

PULSE

CLK0T

PD

OUT1

PULSE

P2L

RES

CLK

Figure 57 Energy-recovery timing constraints

534 Timing Constraints

The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57

The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)

54 simulations

To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the

54 simulations 84

M1IN IN

IN

OUT OUT

IN

Figure 58 Test case for the evaluation of the theoretical model

Q250MHz 500MHz 750MHz

T S e () T S e () T S e ()

6 360 409 -120 476 516 -78 578 613 -57

8 308 347 -114 417 449 -70 515 543 -52

10 276 312 -115 382 411 -70 477 498 -43

12 255 289 -119 358 386 -71 451 472 -44

Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)

adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58

The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51

The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations

In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency

54 simulations 85

5 6 7 8 9 10 11 12 130

10

20

30

40

50

60

70

T250 S250 T500 S500 T750 S750

Q factor

Energydissipation(fF)

Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)

5 6 7 8 9 10 11 12 13

-14

-12

-10

-8

-6

-4

-2

0

250MHz 500MHz 750MHz

Q factor

Error()

Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator

The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation

54 simulations 86

M1

TSV

TSV

TSV

Data in OutCMOS

OutER

Energy-recovery

CMOS

P2LC

Adiabaticdriver

Figure 511 Test case for comparison of energy-recovery to CMOS

541 Comparison to CMOS

When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F

for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is

described by Equation 52 Appending the data switching activity(D)

ECMOS = D middot 12

CLV2DD (516)

Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS

circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison

Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range

54 simulations 87

Figure 512 Improved P2LC design

0 100 200 300 400 500 600 700 800 9000

2

4

6

8

10

12

14

Frequency (MHz)

Energy

dissipation(fJ)

Figure 513 Simulated energy dissipation of improved P2LC design

Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current

The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor

54 simulations 88

4 6 8 10 12 14 16 18 20 22-20

-10

0

10

20

30

40

50

60

800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor

Energyred uction()

Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)

4 6 8 10 12 14 16 18 20 220050

100150200250300350400450

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)

values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors

In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies

In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q

factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total

In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS

circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in

54 simulations 89

4 6 8 10 12 14 16 18 20 2200

100

200

300

400

500

600

700

Inductor M1 Adiabatic driver P2LC

Q factor

Energy

con trib

ution(

)

Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)

20 40 60 80 100 120 140 160 180 200 220-100

00

100

200

300

400

500

Q6 Q10 Q14 Q18

TSV capacitance (fF)

Energy

red u

ction(

)

Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)

Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value

Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz

Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm

node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS

54 simulations 90

4 6 8 10 12 14 16 18 20 22-40

-20

0

20

40

60

80

D=05 D=07 D=09 D=1

Q factor

Energyred uction()

Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)

4 6 8 10 12 14 16 18 20 220

10

20

30

40

50

60

130nm 65nm

Q factor

Energyred uction()

Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)

55 summary and conclusions 91

55 summary and conclusions

The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process

The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement

Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme

The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process

6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R

61 introduction

In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance

Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements

62 physical implementation

The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5

The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61

The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz

were calculated using the methodology developed in Chapter 5

and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery

circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)

92

62 physical implementation 93

AB

AB

CDEFD

C

ABCD

EFD

C

Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)

Adiabatic driver 80-bit circuit

Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )

RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)

CL 207 f F (Eq57) Lind 613nH (Eq54)

Table 61 Energy-recovery IO circuit parameters

62 physical implementation 94

IntegratedInductor

VINDOER

PulseGenerator

S1AS2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

Figure 62 Energy-recovery IO circuit schematic diagram

Metal2Metal1PolyActiveVDD

GND

RES_CLK

DATA

OUT

OUT_B

DATA

OUT_B OUT

RES_CLK

GND

VDD

Figure 63 Adiabatic driver (layoutschematic view)

the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs

621 Adiabatic driver

The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63

62 physical implementation 95

Metal2Metal1PolyActive

IN

OUT_B

VDD

IN_B

OUT

GND

OUT_B

IN

GND

VDD

OUT

IN_B

Figure 64 P2LC (layoutschematic view)

622 Pulse-to-level converter

The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals

623 Data generator

A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1

As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE

624 Pulse generator

The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the

62 physical implementation 96

Metal2Metal1PolyActive EN

CLK

VDD

Q

GND

D

CLKGND

ENVDD

QF-F

D

GND

VDDCLK

D Q

Figure 65 Data generator (layoutschematic view)

Delay1

WPULSE

PULSE

CLKTCLK

DATA

CLK_RES

Figure 66 Timing constraints for the energy-recovery demonstratorcircuit

62 physical implementation 97

PULSES1A

S2

GND

VDD

Metal2Metal1PolyActive

VDD

S1

S2

GND

PULSE

Figure 67 Pulse generator (layoutschematic view)

S2ΔPhase

WPULSE

PULSE

S1A

Figure 68 Pulse generator InputOutput signals

gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control

For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68

625 Integrated inductor

The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-

1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)

62 physical implementation 98

100 200 300 400 500 600 700 8005

7

9

11

13

15

17

19

21

10

15

20

25

30

35

40

45

Q (Simulated) Energy Reduction ()

Frequency (MHz)

Qfactor

Energyreduction()

Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS

lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)

As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited

For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610

626 Top level design

The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)

For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously

62 physical implementation 99

500μ

m

500μm

Figure 610 Planar square spiral inductor designed in ASITIC

shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance

The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG

80 = 16880 = 21Ω However as

the current flows to lower branch levels in the tree the perceivedresistance increases to RTG

16 RTG4 and on the last one to RTG Any

additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512

Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20

4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit

To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of

62 physical implementation 100

Adiabatic drivers

RTG

16

RTG

80

RTG

4

RTG

Level 1Level 2Level 3Level 4

Figure 611 Resonant pulse distribution tree

Branch level Wire resistance

1st 5 middot 16880 = 105mΩ

2nd 5 middot 16816 = 525mΩ

3rd 5 middot 1684 = 21Ω

4th 5 middot 168Ω = 84Ω

Total load 168+8480 + 21

20 + 05255 + 0105 = 252Ω

Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree

63 post-layout simulation and comparison 101

+-

+-Metal2 Metal1

180μ

m

100μm

Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)

the energy-recovery IO circuit in layout view is illustrated inFigure 613

627 CMOS IO circuit

The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance

The top level layout view of the CMOS IO circuit can be seenin Figure 616

63 post-layout simulation and comparison

The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed

63 post-layout simulation and comparison 102

22mm

12mm

GND GNDVBUF1 VDR

GND

VIND

S1A

S2

EN

OER

Figure 613 Energy-recovery IO circuit in layout view

63 post-layout simulation and comparison 103

OCMOS

S1B DataGenerator

TSV

CMOSDriver

CMOSDriver

TSV Buffer

1

80

EN

Figure 614 CMOS IO circuit schematic diagram

GND

VCMOS

DATA

078u

039u

105u

053u

316u

158u

950u

475u

OUT

Figure 615 CMOS driver schematic

12mm

12mm

OCMOS S1B GND VBUF2 VCMOSGND

GND GNDEN VCMOS

Figure 616 CMOS IO circuit in layout view

63 post-layout simulation and comparison 104

Energy dissipation (80-bits cycle)

Inductor 11pJ (Eq514)

Switch M1 092pJ [ ]

P2LCs 098pJ (Simulated)

Adiabatic drivers 315pJ (Eq512)

Total 615pJ

Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz

After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency

Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme

As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the

63 post-layout simulation and comparison 105

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14RES_CLK

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14DATA

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

8 9 10 11 12 13 14 15 160

02

04

06

08

1

12

14OER

Time (ns)

V(V

)

Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit

64 experimental measurements 106

AreaEnergy

dissipation(Theory)

Energydissipation

(Post-layout)

CMOS 144mm2922pJ

(115 f Jbit)

(Eq52)

1265pJ

(158 f Jbit)

Energy recovery 264mm2 615pJ

(77 f Jbit)

768pJ

(96 f Jbit)

Difference +83 minus333 minus393

Table 64 Post-layout simulated energy dissipation Area require-ments

theoretically calculated 333 up to 393 in the post-layoutsimulations

64 experimental measurements

The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively

641 Measurement setup

For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620

The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66

642 Results

CMOS IO circuit

In Figure 621 the input signal S1B and the output signal OCMOS

are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency

64 experimental measurements 107

Figure 618 Energy-recovery IO circuit micrograph

Figure 619 CMOS IO circuit micrograph

AABC

DEFF

D

EEBA

EBBA

Figure 620 Measurement setup

64 experimental measurements 108

Test pad Direction Instrument connection

VIND Power Power supply - gt06V

GND Power Power supply - 0V

S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)

EN InputDigital Output (enable outputdata switching)

OER Output Oscilloscope

VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V

VBUF1 Power Power supply (Buffers) - 12V

Table 65 Energy-recovery circuit instrument connections

Test pad Direction Instrument connection

VCMOS PowerPower supply (CMOS drivers) -12V

GND Power Power supply - 0V

VBUF2 Power Power supply (Buffers) - 12V

S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)

OCMOS Output Oscilloscope

EN InputDigital Output (enable outputdata switching)

Table 66 CMOS circuit instrument connections

64 experimental measurements 109

S1B

OCMOS

Figure 621 Oscilloscope output for signals S1B and OCMOS

100 200 300 400 500 600 700 800012345678910

Die1 Die2 Simulation

Frequency (MHz)

Powerdissipation(mW)

Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit

that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected

The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations

Energy-recovery IO circuit

Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER

64 experimental measurements 110

100 200 300 400 500 600 700 800024681012141618

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 623 Measuredsimulated energy dissipation for the CMOScircuit

S1A

OER

Figure 624 Oscilloscope output for signals S1A and OER

shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected

The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)

Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate

65 summary and conclusions 111

200 250 300 350 400 450 500 550 600 65002468101214161820

Die1 Die2 Simulation

Frequency (MHz)

Energydissipation(pJ)

Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit

IntegratedInductor

VINDOER

PulseGenerator

S1A

S2

DataGenerator

TSV

TSV

P2LCAdiabaticDriver

P2LCAdiabaticDriver TSV

TSV

Buffer

Buffer

1

80

1

80

EN

DecouplingCapacitor

200pF

Figure 626 Energy-recovery IO circuit with fault hypothesis

properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER

switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation

65 summary and conclusions

In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them

The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-

65 summary and conclusions 112

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050

055

060

065

070

075

080RES_CLK

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150

02

04

06

08

1

12

14PULSE

Time (ns)

V(V

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500

02

04

06

08

10

12

14OER

Time (ns)

V(V

)

Figure 627 Simulation of circuit with fault hypothesis

65 summary and conclusions 113

200 250 300 350 400 450 500 550 600 6500

5

10

15

20

25

Die1 Die2 Simulation CL=200pF

Frequency (MHz)

Ene

rgy

diss

ipat

ion

(pJ)

Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis

ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS

circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated

CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

7C O N C L U S I O N S

Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density

As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV

devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs

Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-

FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process

Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process

The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work

114

71 main contributions 115

71 main contributions

From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data

The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations

In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design

72 future work 116

approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances

The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics

72 future work

The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV

capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]

In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes

1 The transistors should be in a regular grid in the array forsimple processing of the measurement data

72 future work 117

2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy

3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV

4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues

Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility

In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory

AA P P E N D I X A P H O T O S

Figure A1 3D65 300mm wafer (on chuck)

Figure A2 Probe card

118

appendix a photos 119

Figure A3 3D130 200mm wafer (on chuck)

ProbeHead

Microscope

Figure A4 Probe head

BA P P E N D I X B S TAT I S T I C S

mean (micro)

In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as

micro =1n

n

sumi=1

xi

median (m)

The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean

standard deviation (σ)

Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation

A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability

standard deviation of the mean (σmicro )

The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated

Assuming there is a data set X with n values xi of mean micro

120

appendix b statistics 121

variance(X) = σ2X

variance(micro) = variance( 1n sum

ni=1 xi) =

1n2 variance(sumn

i=1 xi) =1n2 sum

ni=1 variance(xi) =

nn2 variance(X) =1n variance(X) =

σ2Xn rarr

σmicro = σXradicn

coefficient of variation

The coefficient of variation is the ratio of the standard deviationσ to the mean micro

CV =σ

micro

It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage

outlier

Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set

kruskalndash wallis [34]

KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal

B I B L I O G R A P H Y

[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac

currentsourcebroadpurposemn=2602

[2] Max-3d URL httpwwwmicromagiccom

[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet

[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet

[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001

[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI

Design Conf pages 171ndash174 2005 doi 101109ICVD200564

[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf

3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581

[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009

[9] C H Bennett Logical reversibility of computation IBM

Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525

[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038

scientificamerican0785-48

[11] E Beyne 3d system integration technologies In Proc Int

VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113

122

bibliography 123

[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs

22nm-Details_Presentationpdf

[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327

[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534

[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices

Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124

[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942

[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc

IEEE Int Interconnect Technology Conf and 2011 Materials for

Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326

[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf

(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823

[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS

rsquo04 volume 2 2004 doi 101109ISCAS20041329255

[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE

Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260

[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-

bibliography 124

asitics for 3-dimensional integrated circuits In Workshop

Notes Design Automation and Test in Europe (DATE) 2009

[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT

2008 pages 98ndash103 2008 doi 101109IDT20084802475

[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec

beScientificReportSR2009HTML1213307html

[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905

[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE

Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329

[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int

Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311

[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508

[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712

[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii

B9780750677295500446

[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455

bibliography 125

[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591

[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test

Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889

[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on

Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10

1145224081224115

[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical

Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779

[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom

pressroompdfkkuhnKuhn_IWCE_invited_textpdf

[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5

(3)183ndash191 1961 doi 101147rd530183

[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827

[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190

[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf

(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839

[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-

nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079

bibliography 126

[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System

level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm

org101145966747966750

[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453

[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf

PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725

[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE

Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793

[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629

[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp

Exhibition (DATE) pages 1689ndash1694 2010

[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573

[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190

[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi

bibliography 127

A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices

Meeting (IEDM) 2010 doi 101109IEDM20105703278

[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727

[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10

1109JPROC1998658762

[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test

Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103

[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc

56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676

[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443

[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88

(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom

sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on

[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303

bibliography 128

[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium

on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145

10576611057669

[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872

[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004

[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-

sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888

[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156

[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc

IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522

[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501

[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667

[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477

[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-

tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609

bibliography 129

[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology

Conf 2006 doi 101109ECTC20061645678

[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power

Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220

[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc

Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786

[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-

Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338

[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088

[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044

[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int

Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763

[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-

tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http

dxdoiorg101007BF01788562 101007BF01788562

bibliography 130

[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-

electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww

sciencedirectcomsciencearticleB6V44-4829XDJ-2N

2f5b2c221dcf240c38cd13b7b3432f913

[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182

[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541

[78] NHE Weste and DF Harris CMOS VLSI design a cir-

cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775

[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite

printsteve_wozniak_interview

[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-

tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716

[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect

Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710

[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746

[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764

[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915

bibliography 131

[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610

  • Abstract
  • Publications
  • Acknowledgments
  • Contents
  • List of Figures
  • List of Tables
  • Acronyms
  • 1 Introduction
    • 11 Research goals and contribution
    • 12 Thesis outline
      • 2 Background
        • 21 3D integrated circuits
          • 211 3D integration approaches
          • 212 TSV interconnects
            • 22 Energy-recovery logic
              • 221 Energy-recovery architectures
              • 222 Power-clock generators
                • 23 Full-custom design flow
                • 24 Summary
                  • 3 Characterization of TSV capacitance
                    • 31 Introduction
                    • 32 TSV parasitic capacitance
                    • 33 Integrated capacitance measurement techniques
                    • 34 Electrical characterization
                      • 341 Physical implementation
                      • 342 Measurement setup
                      • 343 Experimental results
                        • 35 Summary and conclusions
                          • 4 TSV proximity impact on MOSFET performance
                            • 41 Introduction
                            • 42 TSV-induced thermomechanical stress
                            • 43 Test structure implementation
                              • 431 MOSFET arrays
                              • 432 Digital selection logic
                                • 44 Experimental measurements
                                  • 441 Data processing methodology
                                  • 442 Measurement setup
                                  • 443 PMOS (long channel)
                                  • 444 PMOS (short channel)
                                    • 45 Summary and conclusions
                                      • 5 Evaluation of energy-recovery for TSV interconnects
                                        • 51 Introduction
                                        • 52 Energy-recovery TSV interconnects
                                        • 53 Analysis
                                          • 531 Adiabatic driver
                                          • 532 Inductors parasitic resistance
                                          • 533 Switch M1
                                          • 534 Timing Constraints
                                            • 54 Simulations
                                              • 541 Comparison to CMOS
                                                • 55 Summary and conclusions
                                                  • 6 Energy-recovery 3D-IC demonstrator
                                                    • 61 Introduction
                                                    • 62 Physical implementation
                                                      • 621 Adiabatic driver
                                                      • 622 Pulse-to-level converter
                                                      • 623 Data generator
                                                      • 624 Pulse generator
                                                      • 625 Integrated inductor
                                                      • 626 Top level design
                                                      • 627 CMOS IO circuit
                                                        • 63 Post-layout simulation and comparison
                                                        • 64 Experimental measurements
                                                          • 641 Measurement setup
                                                          • 642 Results
                                                            • 65 Summary and conclusions
                                                              • 7 Conclusions
                                                                • 71 Main contributions
                                                                • 72 Future work
                                                                  • A Appendix A Photos
                                                                  • B Appendix B Statistics
                                                                  • Bibliography
Page 10: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 11: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 12: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 13: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 14: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 15: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 16: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 17: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 18: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 19: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 20: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 21: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 22: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 23: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 24: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 25: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 26: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 27: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 28: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 29: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 30: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 31: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 32: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 33: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 34: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 35: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 36: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 37: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 38: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 39: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 40: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 41: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 42: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 43: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 44: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 45: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 46: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 47: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 48: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 49: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 50: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 51: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 52: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 53: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 54: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 55: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 56: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 57: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 58: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 59: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 60: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 61: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 62: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 63: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 64: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 65: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 66: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 67: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 68: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 69: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 70: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 71: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 72: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 73: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 74: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 75: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 76: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 77: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 78: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 79: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 80: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 81: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 82: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 83: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 84: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 85: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 86: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 87: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 88: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 89: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 90: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 91: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 92: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 93: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 94: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 95: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 96: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 97: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 98: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 99: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 100: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 101: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 102: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 103: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 104: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 105: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 106: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 107: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 108: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 109: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 110: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 111: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 112: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 113: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 114: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 115: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 116: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 117: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 118: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 119: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 120: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 121: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 122: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 123: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 124: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 125: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 126: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 127: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 128: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 129: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 130: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 131: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 132: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 133: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 134: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 135: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 136: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 137: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 138: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 139: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 140: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 141: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 142: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 143: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 144: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 145: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 146: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 147: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 148: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate
Page 149: Optimizing the integration and energy efficiency of through …async.org.uk/tech-reports/NCL-EECE-MSD-TR-2011-173.pdf · 2011. 12. 2. · ative test structures are required to evaluate