Evaluation of thermal-aware design prototype tools


Confidential

ICT-2009.3.2-248603-IP

Modelling, Control and Management of Thermal Effects in Circuits of the Future

WP no. Deliverable no. Lead participant

WP7 D7.3.1 NXP-NL

Evaluation of thermal-aware design prototype tools Prepared by M. Willemsen

Issued by THERMINATOR Project Office

Document Number THERMINATOR/D7.3.1/v1

Dissemination Level Confidential

Date 12/02/2013

© Copyright 2010-2013 STMicroelectronics, Intel Mobile Communication, NXP

Semiconductors, GRADIENT DESIGN AUTOMATION , MUNEDA, SYNOPSYS ,

BUDAPESTI MUSZAKI ES GAZDASAGTUDOMANYI EGYETEM , CSEM,

FRAUNHOFER , IMEC, CEA-LETI, OFFIS, Politecnico di Torino, ALMA MATER

STUDIORUM -Universita’ Di Bologna, ST-Polito.

This document and the information contained herein may not be copied, used or disclosed

in whole or in part outside of the consortium except with prior written permission of the

partners listed above.


THERMINATOR FP7 ICT – 2009.3.2 - 248603 D7.3.1

Confidential Page 2 12/02/2013

Document Title Evaluation of thermal-aware design prototype tools

Type Deliverable CO

Ref D7.3.1

Target version V1_1

Current issue V0_1

Status Released

File

Author(s) Wilhelm Moering (NXP-D), A. Calimera (POLITO), A. Macii (POLITO), A. Timar (BME), A. Szalai (BME), G. Nagy (BME), P. Knocke (OFFIS), S. Rosinger (OFFIS), V. Melikyan (SNPS-AM), A. Ripp (MUN), H. Oprins (IMEC), S. Stoffels (IMEC), A. Bartolini (POLITO)

Reviewer(s) S. Holland (NXP-D), D. Rossi (UNIBO)

Approver(s) G.Gangemi (ST)

Approval date 12/02/2013

Release date 12/02/2013

Distribution of the release Dissemination level CO

Distribution list

History Rev. DATE Comment

0.1 22-01-2013 Initial version

1.0 30-01-2013 Revised version, approved by all partners

1.1 12/02/2013 Check and ship out


References

[1] Garrou, Ph., “Handbook of 3D-Integration: Technology and Applications of 3D Integrated Circuits”, Wiley-VCH (Weinheim, 2008).

[2] Chanchani, R., “3D Integration Technologies – An Overview”, in Materials for Advanced Packaging, edited by D. Lu, C.P. Wong, Springer (2009), pp. 1-50.

[3] Beyne, E., “Through-Silicon via Technology for 3D IC”, in Ultra-thin Chip Technology and Applications, edited by J.N. Burghartz, Springer (2011).

[4] Marchal, P. et al., “3D technology roadmap and status”, Proc. IITC 2011, pp. 1-3.

[5] Gu, S. et al., “Stackable memory of 3D chip integration for mobile applications”, Proc. IEDM 2008, pp. 1-4.

[6] Brunschwiler, T.; Michel, B., “Thermal Management of Vertically Integrated Packages”, in Handbook of 3D Integration: Technology and Applications of 3D Integrated Circuits, edited by P. Garrou, C. Bower and P. Ramm, Wiley-VCH Verlag GmbH (Weinheim, 2008), Vol. 2, Part IV, pp. 635-649.

[7] Agonafer, D. et al., “Thermo-Mechanical Challenges in Stacked Packaging”, Heat Transfer Engineering, Vol. 29, No. 2 (2008), pp. 134-148.

[8] Kim, J. et al., “A 1.2V 12.8GB/s 2Gb mobile Wide-I/O DRAM with 4×128 I/Os using TSV-based stacking”, IEEE International Solid-State Circuits Conference (ISSCC), Feb 2011.

[9] Alpha Company Ltd, “Passive heat sinks”, http://www.micforg.co.jp/en/cat pass.html, 2012.

[10] “Active heat sinks”, http://www.micforg.co.jp/en/cat fe.html, 2012.

[11] JEDEC Solid State Technology Association, JEDEC Standard JESD51-12: Guidelines for Reporting and Using Electronic Package Thermal Information, www.jedec.org, May 2005.

[12] Intel, Ball Grid Array (BGA) Packaging, Intel Packaging Databook, Chapter 14, 2010.

[13] Intel, Performance Characteristics of IC Packages, Intel Packaging Databook, Chapter 4, 2010.

[14] Intel, Physical Constants of IC Package Materials, Intel Packaging Databook, Chapter 5, 2010.

[15] Therminator Consortium, Deliverable D6.2.1: Framework overview for an all level thermal simulation of 3D SiP stacks and 2D SoCs, 2010.

[16] Therminator Consortium, Deliverable D6.2.2: Presentation and evaluation of an all level thermal simulator of 3D SiP stacks and 2D SoCs, 2012.

[17] Therminator Consortium, Deliverable D6.3.1: Specification and standardization of the thermal aware design, optimization and exploration flow and preliminary presentation of design techniques, 2010.

[18] Therminator Consortium, Deliverable D6.3.2: Presentation and evaluation of thermal-aware design techniques for 3D SiP stacks and 2D SoCs, 2012.

[19] Therminator Consortium, Deliverable D6.3.3: Report on integration of the individual optimization techniques, 2012.

[20] Therminator Consortium, Deliverable D1.2.1: Specification of internal design flows and environments and existing tool interfaces, 2010.

[21] Therminator Consortium, Deliverable D1.3.1: Technical specification of test cases and distribution to partners of concern, 2010.

[22] Reef Eilers, Malte Metzdorf, Sven Rosinger, Domenik Helms, Wolfgang Nebel, “Phase space based NBTI model”, Proc. of International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS), 2012.

[23] Sven Rosinger, Malte Metzdorf, Domenik Helms, Wolfgang Nebel, “Behavioral-Level Thermal- and Aging-Estimation Flow”, Proc. of 12th Latin-American Test Workshop (LATW), pp. 1-6, 2011.

[24] G. Gangemi, “FP7-Funding Projects THERMINATOR, SMAC, MANON Overview”, MUGM MunEDA User Group Meeting 2012, October 2012, Munich, Germany.

[25] Z. Abbas, M. Olivieri, A. Ripp, G. Strube, M. Yakupov, “Yield optimization for low power current controlled current conveyor”, SBCCI 2012, September 2012, Brasília, Brazil.

[26] A. Colaci, G. Boarin, A. Roggero, L. Civardi, C. Roma, A. Ripp, M. Pronath, G. Strube, “Systematic Analysis & Optimization of Analog/Mixed-Signal Circuits Balancing Accuracy and Design Time”, SBCCI 2011 Brazil, September 2011, Sao Paolo, Brazil.

[27] N. Seller, “Optimization of a 2.133GHz level shifter in 28nm”, MUGM MunEDA User Group Meeting 2011, Munich, Germany.

[28] U. Trautner, M. Pronath, “Synopsys Custom and Analog Mixed-Signal Overview & MunEDA WiCkeD Integration”, MUGM MunEDA User Group Meeting 2010, Munich, Germany.


Contents

Document
Distribution of the release
References
1 Introduction
2 Evaluation of thermal models for three-dimensional integrated circuits (IMEC)
2.1 Introduction
2.2 Technical results
2.2.1 Evaluation of thermal accuracy for the DRAM-on-logic test vehicle
2.2.1.1 Thermal test vehicle description
2.2.1.2 Thermal model DRAM-on-Logic stack
2.2.1.3 Experimental model validation
2.2.2 Evaluation of the thermal-aware design prototype tools
2.2.2.1 Design
2.2.2.2 Experiments and results
2.2.2.3 Measurable objectives
2.3 Conclusions
3 Evaluation thermal-aware synthesis and optimization tools (POLITO, together with SNPS-AM)
3.1 Introduction
3.2 Benchmark description
3.3 Thermal-Aware Optimization Techniques
3.3.1 ITD-Aware Dual-Vth Assignment
3.3.2 Tunable Clock Tree
3.4 Results on the testbench
3.4.1 Validating the ITD-Aware Dual-Vth Assignment
3.4.2 Validating the Tunable Clock-Tree Methodology
3.5 Conclusions
4 Evaluation of thermal and aging aware optimization flow for two-dimensional systems on chips (OFFIS, together with UNIBO/ST, and CEA-LETI)
4.1 Introduction to evaluation of high-level thermal and degradation estimation and optimization
4.2 Technical results
4.2.1 Evaluation of green-function based thermal estimation and optimization based on use case 5
4.2.1.1 Introduction to use case 5 motion detection design
4.2.1.2 Custom ASIC hardware accelerators power and area determination
4.2.1.3 Use case 5 IC package properties
4.2.1.4 Evaluation of developed estimation/analysis and optimization flow and tools of WP6
4.2.1.5 Thermal- and degradation-aware optimization evaluation
4.2.2 Evaluation of green-function based thermal estimation based on Genepy design of CEA-LETI
4.3 Conclusions
5 Verification of simulator engines (BME together with POLITO)
5.1 Introduction
5.1.1 The simulator engines developed at BME
5.1.1.1 Logi-thermal simulation
5.1.1.2 Electro-thermal simulation
5.1.2 Measurable objectives
5.2 Detailed description of the verification
5.2.1 Verification of the logi-thermal simulator engine
5.2.2 Comparison of the two logi-thermal simulator engines
5.2.3 Real-world evaluation designs
5.2.3.1 Ring oscillator containing 1000 inverter cells
5.2.3.2 Test circuit from POLITO
5.2.4 Verification of the electro-thermal engine
5.3 Conclusion
6 Thermal effects in identification applications (NXP-D)
6.1 Introduction
6.2 Technical results
6.3 Conclusions
The measurements of the NXP test chip and the results from Synopsys TCAD simulation and modelling simulation tools match sufficiently (MO7.3.11). The characterization of the diode-voltage over temperature is well in line with the theoretical expectation. The impact of the encapsulation on the thermal behaviour with respect to the self-heating in silicon could be demonstrated.
7 Evaluation of simulation-based verification, optimization and RSM model generation methodologies (MUN, together with NXP-D, and ST)
7.1 Introduction
7.2 Technical results
7.3 Conclusions
8 Conclusions
9 Measurable objectives
10 Publications and presentations


1 Introduction

The main objective of WP7 is to validate the models, design techniques, and tools developed

within the Therminator project. WP7 is divided into three tasks. In T7.1, the validation of

thermal models of new devices, materials, and technologies has been done. The focus in T7.1

is on the device level, and the level of elementary building blocks to be used in large(r)

circuits. The effectiveness and usability of design techniques has been addressed in T7.2. In

T7.2, larger building blocks, parts of circuits, and test chips are used as test cases. The final

task, T7.3, is to benchmark and demonstrate the effectiveness of the developed EDA tools. In

this task, the validation addresses test chips and prototypes.

The evaluation of the developed EDA tools is the main topic of T7.3, and is addressed in this

deliverable. The evaluation and demonstration activities are done with the tools developed in

WP3, 4, and 6 of Therminator. For demonstration, the test cases of WP1 and examples

provided by individual partners are used. The test cases come from different fields of the

semiconductor industry, e.g. digital, analog, and RF. The prototype tools of WP3, 4, and 6

have been developed by research institutes, universities, and EDA vendors. Demonstration is

mainly done on examples provided by or of interest to the industrial partners, i.e. ST and

NXP-D. These collaborations within T7.3 have been very useful in creating an effective

innovation infrastructure, in which novel ideas from universities, research institutes, and

EDA vendors are applied in test cases of interest for the European semiconductor industry.

Evaluation of the prototype tools in Therminator is done using typical figures-of-merit such as

ease of use, accuracy, and integration within existing flows. The effectiveness of these tools is

demonstrated in the form of improvements such as less temperature-sensitive designs, higher

yield, and reduction in design times. Specific examples of these improvements will be given

in this report.

More specifically, the thermal models for three-dimensional integrated circuits developed by IMEC in WP6 are evaluated in chapter 2 by demonstrating their accuracy and ease of use on a two-layer DRAM-on-logic stack. In chapter 3, POLITO applies its thermal-aware synthesis tools, developed in WP3, to digital parts of an MCU provided by ST, and reports the improvements in temperature-induced delays and clock skew obtained with these tools. Next, in chapter 4, OFFIS shows results on thermal- and aging-aware optimization for an example provided by UNIBO/ST. The OFFIS tool is also compared against measurement data provided by CEA-LETI from test case 4. In chapter 5, BME validates the accuracy of its logi-thermal simulation tool against other tools and measurements. Chapter 6 addresses the impact of encapsulation on self-heating in identification applications; here NXP-D compares measurements from its test chip with simulation. Finally, in chapter 7, MUN demonstrates the effectiveness of its tools developed in WP6 by showing improvements in yield on a test case provided by NXP-D and improvements in design time on a test case from ST.

The results of T7.3 are made tangible in terms of measurable objectives. An overview of all of

these measurable objectives is given in chapter 9 of this report. In this chapter, the measurable

objectives are also linked to Therminator’s project objectives. The novelty of the work is

shown in chapter 10, where all of the output in terms of journal papers and conference

contributions is collected.


2 Evaluation of thermal models for three-dimensional integrated circuits (IMEC)

2.1 Introduction

Three-dimensional (3D) integration of integrated circuits is considered a promising technology for circuit design. It reduces the form factor of today's systems, eases the interconnect performance limitation, and makes it possible to interconnect multiple stacked dies made in different process technologies [1],[2]. The cornerstones of this technology are through-Si vias (TSVs) and microbumps, for which process solutions, reliability and design rules are now becoming available [3],[4]. One of the most likely applications of 3D technology is DRAM-on-logic integration [5]. Thermal management issues are considered one of the main potential showstoppers for 3D integration [6],[7]. In WP6, innovative methodologies for thermal modelling (T6.2) and thermal-aware design optimization (T6.3) for systems and packages have been developed. In this deliverable, the automated thermal-aware design capabilities developed there are evaluated on a real demonstrator using 3D-TSV technology. The test case selected for this evaluation is a 2-layer DRAM-on-logic stack in a BGA package (test case 6).

Innovation metric:

Evaluation of one integrated tool allowing early system floor planning and exploration of many system and physical options and their impact on thermal behaviour.

This is a novel thermal-aware design optimization tool that allows:

- avoidance of hot spots and delay degradation
- electro-thermal coupling
- reliability modelling
- mechanical stress reduction

Measurable objectives:

- Thermal modelling accuracy within 5% of thermal measurements (MO7.3.1)
- Ease of use of the design flow (MO7.3.2)
- Speed of the complete design flow, that is, from RTL to virtual layout (MO7.3.3)
- Accuracy of the design flow within 15% (MO7.3.4)

Selected test case:

2-layer DRAM-on-logic stack in a BGA package (test case 6)

Deliverable content

The evaluation of the thermal-aware design optimization of task T6.3 for the DRAM-on-logic

test case consists of two parts. In the first part (section 2.2.1), the thermal accuracy of the

underlying thermal models is evaluated for the packaged DRAM-on-Logic stack. In the

second part (section 2.2.2), the overall design flow is evaluated using the design of an

OpenSPARC processor at RTL level and of a wide IO DRAM chip to demonstrate the

capabilities of the tool chain and evaluate the speed of the flow.


2.2 Technical results

2.2.1 Evaluation of thermal accuracy for the DRAM-on-logic test vehicle

2.2.1.1 Thermal test vehicle description

This deliverable reports the validation of the thermal models developed in T6.2, using a packaged DRAM-on-logic stack. Such a package is shown schematically in Figure 1 (left). A heterogeneous DRAM-on-logic chip stack has been designed and fabricated to assess the technology and design challenges for 3D IC applications. The logic die, with a thickness of 25 µm, is manufactured in a 130 nm technology in which Cu TSVs are integrated [4]. The TSVs have a diameter of 5 µm and a height of 25 µm. The backside of the thinned wafer consists of a thick polymer layer, serving as a passivation layer, and a 10 µm-pitch backside redistribution layer (RDL). On top of the logic die, a thicker DRAM is stacked using TSVs and microbumps at 50 µm pitch. The standoff height between the logic and DRAM die is typically 13 µm. A no-flow underfill is used to fill the gaps between the microbumps. Figure 2 (left) shows the stack of the DRAM and logic die, and Figure 1 (right) shows a detail of the CuSn microbump between the logic and DRAM. The entire stack is integrated into an FCBGA package substrate, with the thinned die face down (Figure 3).

Figure 1. Left: Schematic cross section of the packaged DRAM on logic stack. The

arrows indicate the heat paths from the heat generation in the logic die to the cooling

solution; either through the top side of the package Qt or through the bottom side of the

package Qb. – Right: Schematic of the µbump geometry.

Figure 2. Left: Picture of the fabricated DRAM on logic stack – Center: Detail of the

thermal modules in the logic chip layout – Right: Detail of the layout of one heater

module on the logic test chip revealing the location of the 3 heaters and the 5 integrated

temperature sensors.


Figure 3. Left – BGA package containing the DRAM on logic stack – Right: test socket

for thermal measurements.

The logic chip contains test structures for monitoring thermo-mechanical stress and temperatures in a 3D stack, electrostatic discharge hazards, electrical characteristics of TSVs and microbumps, fault models for TSVs, etc. Figure 2 (center) shows the layout of the logic chip. As hotspots and their impact on DRAM performance are a particular concern, dedicated thermal test structures are integrated on the logic die to test the impact of hotspots on DRAM refresh times. The thermal structures include resistive heaters and diodes used as temperature sensors, and are grouped in 3 modules. Figure 2 (right) shows a detail of the layout of one of the 3 thermal modules. Each module includes 5 temperature sensors and 3 heaters with dimensions of 50×50, 150×150 and 500×500 µm², mimicking logic switching. The heaters are made using BEOL resistors and are placed below the sensitive circuits of the DRAM. Figure 2 (right) also shows the location of the logic temperature sensors: one in the center of each heater and one in a corner of each of the larger heaters of 150 and 500 µm. To assess the impact of the CuSn microbumps on the heat transfer between the logic and DRAM, dummy microbumps are added below heater modules 2 and 3, whereas no microbumps are present below heater module 1.

Table 1. Power dissipation in the heater modules during the experimental characterization of the packaged DRAM-on-logic stack using the first experimental configuration.

Scenario                H1-50   H1-50     H1-150  H1-150    H1-500  H1-500    Total power
                        (W)     (W/mm2)   (W)     (W/mm2)   (W)     (W/mm2)   (W)
Heater Module 1, 9V     0.027   10.88     0.291   12.93     0.632   2.53      0.95
Heater Module 1, 11V    0.027   10.88     0.417   18.53     0.905   3.62      1.35
Heater Module 2, 9V     0.027   10.88     0.264   11.73     0.562   2.25      0.85
Heater Module 2, 11V    0.027   10.88     0.374   16.62     0.814   3.26      1.22
Heater Module 3, 9V     0.04    15.84     0.251   11.16     0.511   2.04      0.80
Heater Module 3, 11V    0.04    15.84     0.349   15.51     0.738   2.95      1.13

2.2.1.2 Thermal model DRAM-on-Logic stack

For the thermal experiment, the cooling is applied from the top side of the package, creating a main heat flow path for the dissipated heat from the logic, through the DRAM, to the external heat sink. This case is representative of medium- and high-power applications. Since the performance of the memory technology degrades rapidly above 105 ºC, the additional hot spot power dissipation in the logic is superimposed on a background temperature of 85 ºC applied to the package. In this way, high local temperatures are created in the logic and their impact on the DRAM temperature can be evaluated. This test condition is realized by putting the socket below an air streamer with a fixed temperature of 85 ºC. During the test in this configuration, one heater module is activated at a time, and all three heaters of that module are powered with the same voltage. Table 1 shows the power dissipation and power density of all the heaters during experiments at 9 V and 11 V. The experiment is repeated for each of the 3 modules. In this setup, the temperature is monitored in the steady-state regime in the diodes of the logic die and at certain locations in the DRAM die.
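The relation between the power and power density columns of Table 1 can be checked with a short calculation. The sketch below is illustrative only: it assumes that each heater is a square whose side equals the nominal heater dimension (50, 150 or 500 µm) and that the power density is simply the dissipated power divided by that area. The function and variable names are hypothetical and not part of the project tooling.

```python
def power_density_w_per_mm2(power_w: float, side_um: float) -> float:
    """Return the power density (W/mm2) of a square heater with side length side_um (µm)."""
    area_mm2 = (side_um / 1000.0) ** 2  # 1 mm = 1000 µm
    return power_w / area_mm2


# Heater Module 1 at 9 V: heater side (µm) -> dissipated power (W), values from Table 1.
module1_9v = {50: 0.027, 150: 0.291, 500: 0.632}

# Assumption: density = power / nominal square area of the heater.
densities = {side: power_density_w_per_mm2(p, side) for side, p in module1_9v.items()}
total_power_w = sum(module1_9v.values())

for side, density in densities.items():
    print(f"H{side}: {density:.2f} W/mm2")
print(f"Total power: {total_power_w:.2f} W")
```

Under these assumptions the 150 µm and 500 µm heaters reproduce the tabulated densities (12.93 and 2.53 W/mm2) and the three powers sum to the tabulated total of 0.95 W. The 50 µm heater gives about 10.8 W/mm2 against a tabulated 10.88 W/mm2, a gap that is presumably due to rounding of the tabulated power.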

For the thermal simulations, the thermal model described in deliverable D6.2.2 is used. Figure 4 and Figure 5 show the simulated temperature distribution in the logic and DRAM die, respectively, for the power dissipation specified in Table 1. The modelling results are computed using the compact thermal model developed in T6.2, applied to the 2-layer DRAM-on-logic stack.

Figure 4. Simulation results of the temperature distribution in the logic die.

Figure 5. Simulation results of the temperature distribution in the DRAM die.

2.2.1.3 Experimental model validation

Figure 6 and Figure 7 compare the modelling results with the measurement results for the DRAM die and logic die, respectively, at the 2 power levels specified in Table 1. The results are shown for the cases both with and without µbumps. Figure 8 shows the temperature increase normalized with respect to the power dissipation for the logic and DRAM die. The normalization makes it possible to compare the impact of the µbumps using measurement results obtained at slightly different power dissipations.
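The normalization described above can be sketched as a temperature rise per watt, which has the units of a thermal resistance (ºC/W). The example below is a minimal illustration under stated assumptions: the reference temperature is taken to be the 85 ºC background, the normalizing power is taken to be the total module power from Table 1, and the temperatures are made-up placeholders rather than measured values. All names are hypothetical.

```python
def normalized_temperature_rise(temp_c: float, ambient_c: float, power_w: float) -> float:
    """Temperature rise above ambient divided by dissipated power (ºC/W)."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return (temp_c - ambient_c) / power_w


AMBIENT_C = 85.0  # background temperature applied to the package

# Total module power at 11 V, taken from Table 1 (W).
power_w = {"module1": 1.35, "module2": 1.22}

# Placeholder temperatures (ºC) for illustration only; these are NOT measured values.
hypothetical_temp_c = {"module1": 104.0, "module2": 103.0}

normalized = {
    name: normalized_temperature_rise(hypothetical_temp_c[name], AMBIENT_C, power_w[name])
    for name in power_w
}

for name, value in normalized.items():
    print(f"{name}: {value:.2f} ºC/W")
```

Dividing by the power removes the effect of the slightly different dissipation in the two modules, so any remaining difference between the normalized curves can be attributed to the structure itself, for example the presence of µbumps.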

Figure 6. Comparison between the experimental (markers) and model results (solid

lines) in DRAM chip for heater module 1 without dummy CuSn bumps and heater

module 2 with CuSn bumps.

Figure 7. Comparison between the experimental (markers) and model results (solid

lines) in Logic chip for heater module 1 without dummy CuSn bumps and heater

module 2 with CuSn bumps.

[Figure 6 plots: temperature (ºC) versus distance (mm). One panel shows the series Mod1-9V-EXP, Mod1-11V-EXP, Mod1-9V-CTM and Mod1-11V-CTM; the other shows Mod2-9V-EXP, Mod2-11V-EXP, Mod2-9V-CTM and Mod2-11V-CTM.]

[Figure 7 plots: logic temperature (ºC) versus distance from H50 center (mm). Panels: Heater module 1 and Heater module 2. Series: Exp 9V, Exp 11V, FEM 9V, FEM 11V. Sensor positions: d50, d150c, d150, d500, d500c.]


Figure 8. Comparison between the experimental (markers) and calibrated model results

(lines) in the DRAM (left) and logic (right) for heater module 1 without dummy CuSn

bumps and heater module 2 with CuSn bumps.

From the data in Figure 6 and Figure 7, the accuracy of the thermal model can be evaluated. Table 2 shows the difference between the maximum simulated and measured temperatures for both the DRAM and logic chips, with and without µbumps. The results in Table 2 demonstrate that the measurable objective MO7.3.1 of modelling accuracy within 5% / 5ºC has been achieved for the thermal models applied to the DRAM-on-logic test case.

[Plots for Figure 8: thermal resistance (ºC/W) versus distance (mm) for the DRAM temperature (series Mod1-11V and Mod2-11V, EXP and CTM), and thermal resistance (ºC/W) versus distance from H50 center (mm) for the logic temperature (series Mod1 and Mod2, EXP and CTM; locations d50, d150c, d150, d500, d500c).]

Table 2. Relative difference between the modelling and experimental results for the maximum DRAM and Logic temperature in the heater modules.

                   Tamb (ºC)   T_exp (ºC)   T_model (ºC)   Rel. error (%)   Abs. error (ºC)
DRAM chip
  Heater Module 1  85          99.8         100.4          -3.9             0.6
  Heater Module 2  85          98.18        98.73          -4.0             0.55
Logic chip
  Heater Module 1  85          207          202.12         4.2              -4.88
  Heater Module 2  85          211          208.65         1.9              -2.35
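
The relative errors in Table 2 appear to be computed with respect to the modelled temperature rise above ambient, i.e. (T_exp - T_model) / (T_model - Tamb). This definition is not stated in the text; it is inferred from the tabulated numbers. A minimal Python check under that assumption, which also reads the "5% / 5ºC" objective as requiring both limits (again an interpretation):

```python
# Rows of Table 2: (die, heater module, T_amb, T_exp, T_model), all in ºC.
TABLE_2 = [
    ("DRAM", 1, 85.0, 99.8, 100.4),
    ("DRAM", 2, 85.0, 98.18, 98.73),
    ("Logic", 1, 85.0, 207.0, 202.12),
    ("Logic", 2, 85.0, 211.0, 208.65),
]


def relative_error_percent(t_amb, t_exp, t_model):
    """Assumed definition: error relative to the modelled rise above ambient."""
    return 100.0 * (t_exp - t_model) / (t_model - t_amb)


def within_objective(t_amb, t_exp, t_model, rel_limit=5.0, abs_limit=5.0):
    """MO7.3.1 read as: |relative error| <= 5 % and |absolute error| <= 5 ºC."""
    rel = abs(relative_error_percent(t_amb, t_exp, t_model))
    return rel <= rel_limit and abs(t_model - t_exp) <= abs_limit


for die, module, t_amb, t_exp, t_model in TABLE_2:
    rel = relative_error_percent(t_amb, t_exp, t_model)
    ok = within_objective(t_amb, t_exp, t_model)
    print(f"{die} chip, heater module {module}: {rel:+.1f} %, within objective: {ok}")
```

With this definition the four rows reproduce the relative errors listed in the table to one decimal place.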


2.2.2 Evaluation of the thermal-aware design prototype tools

In this section, we will:

First, briefly describe the design used to evaluate the design prototyping tools;

Then, discuss the experiments that have been performed and the results obtained;

Finally, discuss the measurable objectives for this particular experiment.

2.2.2.1 Design

To demonstrate the design capabilities of the flow, we chose a fairly complex SoC as the working example: the OpenSPARC T2 processor. The RTL sources of this design are available under an open-source license (although not all features are included, notably the memories). A lot of supplementary information about this design (e.g. floor plan descriptions for global physical placement constraints, absolute power values and power breakdowns among components) is available in the literature.

The floor plan of the processor is shown in Figure 9. The chip is built around 8 core subsystems (with L1 and L2 tag memories) clustered in two regions of 4 cores each: the upper and the bottom cluster. The crossbar, the main communication infrastructure of the chip, is in the middle of the chip. The lateral stripes hold the actual L2 data memories and buffers as well as the corresponding memory controllers.

The floor plan also shows some peripheral devices, placed at the bottom and in the middle of the chip.

Figure 9. OpenSPARC T2 processor

The Wide IO memory is modeled as a black box. In this example we used the existing information on the Samsung Wide IO memory, available in [8] and shown in Figure 10. Note that this particular memory does not follow a JEDEC standard, but the design could very easily be adapted to accommodate whatever DRAM configuration one might want to explore, including the JEDEC specification.


Figure 10. Wide IO DRAM from [8].

To the OpenSPARC core we add a Wide IO DRAM controller, modeled as a black box, which provides the logical/physical connection to the Wide IO DRAM.

2.2.2.2 Experiments and results

Obtaining design geometry

The design is first synthesized and then partitioned. In this particular case design partitioning is fairly simple because there is only one entity to be moved to the top tier (the Wide IO DRAM). After this operation, and by applying the appropriate scripts, we can extract all inter-die nets. Each of these nets has an appropriate physical net model attached, chosen according to the physical implementation scenario (in this case a memory on top of the logic, both dies oriented face down). This model is mandatory to enable the correct physical interpretation of the net (the model would be very different for a silicon interposer implementation, for example).

Figure 11 shows the inter-die net model corresponding to the face-to-back integration of the Wide IO DRAM on top of the logic die. The pin on the logic die (e.g. the one from the Wide IO DRAM controller) is connected through the bottom-die metal layers to the TSV, and the TSV is then connected to the bottom-die bump-pin (actually a µbump in this case). This bump-pin is on the back side of the bottom die and is connected to the front bump-pin (also a µbump in this case) of the top die using back-side redistribution metallization layers (RDLs). Note that this second bump-pin is on the front side of the top (memory) die. It connects to the component of the top die through a certain number of the top-die metal layers.
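
The path described above can be written down as an ordered list of segments. The sketch below is only an illustration of that description in Python; the class name, field names and element labels are hypothetical and do not reflect the data model of the actual tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    die: str      # "bottom" (logic die) or "top" (Wide IO DRAM die)
    side: str     # "front", "bulk" or "back" of that die
    element: str  # physical element the net passes through


# One inter-die net for the face-to-back memory-on-logic stack,
# listed in the order given in the text (hypothetical labels).
FACE_TO_BACK_NET = (
    Segment("bottom", "front", "logic pin"),
    Segment("bottom", "front", "metal layers"),
    Segment("bottom", "bulk", "TSV"),
    Segment("bottom", "back", "bump-pin (ubump)"),
    Segment("bottom", "back", "RDL"),
    Segment("top", "front", "bump-pin (ubump)"),
    Segment("top", "front", "metal layers"),
    Segment("top", "front", "memory pin"),
)


def die_crossings(path):
    """Count how many times consecutive segments sit on different dies."""
    return sum(1 for a, b in zip(path, path[1:]) if a.die != b.die)


def elements_on(path, die):
    """List the elements of the path that belong to one die."""
    return [seg.element for seg in path if seg.die == die]


print(die_crossings(FACE_TO_BACK_NET))
print(elements_on(FACE_TO_BACK_NET, "bottom"))
```

A different integration scenario, such as a silicon interposer, would be expressed as a different segment list, which is why the net model has to be selected per scenario.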

Figure 11. Inter-die net model for 3D-Stacked circuit: cross section (on the left) and the

appropriate physical model of the inter-die net (on the right)

After this stage we have a 3D-partitioned netlist of the design. This is illustrated in Figure 12, which shows on the left the global physical constraints for the system (available in the literature) and on the right the various system blocks (only the first level of the hierarchy is shown so as not to clutter the figure).


The design can now be floor planned and then placed and routed at standard cell level. Figure 12 also shows the connectivity analysis, which helps to understand the wiring requirements of the circuit correctly.

Figure 12. Physical constraints defining placement regions (on the left) and 1st

hierarchical level blocks for the OpenSPARC T2 processor (on the right) showing also

the connectivity

After placement and routing we have all the circuit parameters needed to feed the power extraction process before the final thermal simulation (Figure 13).

Figure 13. Placed and routed design: logic (left) Wide IO DRAM (right) and backside

routing (right)

Obtaining power

Once the design is placed and routed, we can proceed to establish the power dissipation values of the various system sub-components.

The power information can be brought into the design using the following techniques:

1. Back-annotation — if power dissipation values are known, design modules (at

whatever abstraction level) can be annotated with absolute static and dynamic power

values. This information is then propagated throughout the flow and used at later

design stages.

2. Power estimation based on gate/flop count — the tool can be used to make an

estimate of both static/dynamic power dissipation components based on gate/flop

count and activity assumptions (using the gate/flop information from the technology

files).

3. Functional/power simulation and accurate annotation — if appropriate functional

test benches have been written and power simulation performed using some dedicated

power simulation tools, the accurate information of the switching activity can be

provided to the tool using standardized activity file formats (namely SAIF).
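
As an illustration of the second technique, the sketch below estimates static and dynamic power from gate and flop counts using the textbook relations P_dyn = activity × C × Vdd² × f and P_static = count × leakage per cell. It is not the estimator implemented in the tool; the class, the function and all numeric values are hypothetical placeholders rather than data from any technology file.

```python
from dataclasses import dataclass


@dataclass
class TechParams:
    # All fields are illustrative; real values would come from technology files.
    vdd: float        # supply voltage [V]
    c_gate: float     # average switched capacitance per gate [F]
    c_flop: float     # average switched capacitance per flop [F]
    leak_gate: float  # leakage power per gate [W]
    leak_flop: float  # leakage power per flop [W]


def estimate_power(n_gates, n_flops, f_clk_hz, activity, tech):
    """Return (static_w, dynamic_w) for a module from its gate/flop counts.

    `activity` is the assumed average switching activity per clock cycle.
    """
    switched_cap = n_gates * tech.c_gate + n_flops * tech.c_flop
    dynamic_w = activity * switched_cap * tech.vdd ** 2 * f_clk_hz
    static_w = n_gates * tech.leak_gate + n_flops * tech.leak_flop
    return static_w, dynamic_w


# Hypothetical module: 1M gates, 100k flops, 1 GHz clock, 10 % activity.
tech = TechParams(vdd=1.0, c_gate=1e-15, c_flop=4e-15,
                  leak_gate=1e-9, leak_flop=3e-9)
static_w, dynamic_w = estimate_power(1_000_000, 100_000, 1e9, 0.1, tech)
print(f"static = {static_w * 1e3:.2f} mW, dynamic = {dynamic_w * 1e3:.1f} mW")
```

An estimate of this kind can be combined with back-annotated absolute values (the first technique) for modules whose power is already known.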


In this particular experiment we used a combination of the first two techniques. The third one was not used since it can be very time consuming in both data preparation and simulation time. In any case, choosing this method at this stage would be overkill: SAIF files provide switching activity at gate level, whereas the thermal simulations we use have a much coarser spatial resolution, with a typical cell size of 100 µm × 100 µm.

Once the power simulation phase is completed, a set of scripts developed for this experiment is used to extract the relevant geometry and power data from the design database. This data is then formatted into a file suitable for input to the Compact Thermal Model (CTM).
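
The extraction scripts and the CTM input format are not described here, so the following is only a sketch of one plausible intermediate step: spreading the power of rectangular blocks uniformly over a grid of 100 µm × 100 µm cells. The function name, the block tuple layout and the example numbers are assumptions made for illustration.

```python
import math


def rasterize_power(blocks, die_w_um, die_h_um, cell_um=100.0):
    """Distribute block power over a regular grid, proportionally to overlap area.

    blocks: iterable of (x0, y0, x1, y1, power_w), coordinates in µm.
    Returns grid[row][col] with the power in W assigned to each cell.
    """
    n_cols = math.ceil(die_w_um / cell_um)
    n_rows = math.ceil(die_h_um / cell_um)
    grid = [[0.0] * n_cols for _ in range(n_rows)]
    for x0, y0, x1, y1, power_w in blocks:
        density = power_w / ((x1 - x0) * (y1 - y0))  # W per µm², uniform in block
        col_lo, col_hi = max(0, int(x0 // cell_um)), min(n_cols, math.ceil(x1 / cell_um))
        row_lo, row_hi = max(0, int(y0 // cell_um)), min(n_rows, math.ceil(y1 / cell_um))
        for row in range(row_lo, row_hi):
            for col in range(col_lo, col_hi):
                # Overlap of the block with this grid cell along x and y.
                dx = min(x1, (col + 1) * cell_um) - max(x0, col * cell_um)
                dy = min(y1, (row + 1) * cell_um) - max(y0, row * cell_um)
                if dx > 0 and dy > 0:
                    grid[row][col] += density * dx * dy
    return grid


# Hypothetical 400 µm x 200 µm die with a single 2 W block.
power_map = rasterize_power([(50.0, 0.0, 250.0, 100.0, 2.0)], 400.0, 200.0)
for row in power_map:
    print(["%.2f" % p for p in row])
```

Because power is assigned by overlap area, the sum over all cells equals the total block power as long as the blocks lie inside the die outline.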

Thermal simulation

Once the input for the CTM, presented in D6.2.2, is ready we can generate the thermal profile for a given stack configuration. The overall process of thermal simulation is illustrated in Figure 14, which shows: a) the spatial power distribution (top of the figure) of the logic die (on the left) and the Wide IO DRAM (on the right) and b) the corresponding spatial temperature distribution.

Figure 14. Power distribution of the bottom (logic) and top (Wide IO DRAM) dies (top

of the figure), and corresponding thermal profiles

The above example represents one design point, that is: for one power dissipation distribution and stack configuration we generate one thermal profile. It is interesting to note that for a given power distribution and stack configuration, the temperature profile and the maximum temperature in both dies will depend on the cooling solution.

Analysis of the thermal hotspots in the design shows a more elevated temperature in the zone of the system crossbar. This is expected because of the high power density in this area, on both the logic and memory dies. The upper CPU cluster also shows a more elevated temperature, because of the boundary (mirror) effect at the circuit edges. Since in the upper cluster the CPUs are closer to the boundary than the L2 tag memories, the temperature in the upper CPU cluster is higher than in the bottom CPU cluster (see the floor plan in Figure 9).


In order to study the impact of the choice of cooling solution, we select five different junction-to-air thermal resistance values, starting at 1.1°C/W and going down to 0.3°C/W in steps of 0.2°C/W (that is 1.1, 0.9, 0.7, 0.5 and 0.3°C/W).

For each thermal resistance value we report the maximum temperature in the Wide IO DRAM, because the maximum temperature in the stack is limited by the maximum temperature allowed in the DRAM. This temperature cannot exceed 90°C without serious deterioration of the DRAM performance (the required DRAM refresh rate increases with temperature: the higher the temperature, the higher the refresh rate).

Figure 15. Maximum temperatures in the Wide IO DRAM die for 3D stacked circuit

By analyzing the results of this experiment, we can draw the following conclusions:

Because of the high total power dissipation in the logic die (around 80W, as this is still a high-performance computing example) compared to the Wide IO DRAM (300mW), the maximum temperature in the stack is quite high and forced cooling is required to keep the temperature below 90°C.

Using an appropriate cooling solution, with a thermal resistance below 0.5°C/W, it is possible to keep the maximum Wide IO DRAM temperature below 90°C.

Although the above-mentioned thermal resistance falls in the category of high-performance cooling systems ([9][10]), we do not need the very advanced cooling often mentioned in the literature for high-performance 3D stacked circuits.
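
The order of magnitude of these conclusions can be cross-checked with a first-order lumped estimate, in which the whole stack is treated as a single node that dissipates the total power through the junction-to-air resistance. This is a deliberately crude sketch, not the CTM: it ignores on-die hotspots and the thermal resistance between the dies, and it reports only the average rise above ambient because the ambient temperature of the experiment is not stated here.

```python
P_LOGIC_W = 80.0      # approximate logic-die power quoted above
P_DRAM_W = 0.3        # Wide IO DRAM power quoted above
T_DRAM_MAX_C = 90.0   # DRAM temperature limit quoted above


def sweep_values(start=1.1, stop=0.3, step=0.2):
    """Thermal resistance values of the sweep, from start down to stop."""
    n = int(round((start - stop) / step)) + 1
    return [round(start - i * step, 10) for i in range(n)]


def lumped_rise_c(r_ja_c_per_w):
    """Average temperature rise above ambient for one lumped resistance."""
    return r_ja_c_per_w * (P_LOGIC_W + P_DRAM_W)


for r_ja in sweep_values():
    rise = lumped_rise_c(r_ja)
    print(f"R_ja = {r_ja:.1f} C/W: average rise {rise:.1f} C, "
          f"ambient must stay below {T_DRAM_MAX_C - rise:.1f} C")
```

Under this simplification the average rise scales linearly with the thermal resistance, which is consistent with the need for a low-resistance (forced) cooling solution at this power level.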

2.2.2.3 Measurable objectives

Here is the list of measurable objectives for this task:

Ease of use (MO7.3.2): the flow is simple to use. Adapting the existing template scripts to the OpenSPARC design took 2 days (including the synthesis time necessary to check that the design set-up is correct). Note that the design is described using a couple of hundred VHDL files, and some amount of manual editing is required to build the design environment.

Speed (MO7.3.3): the complete design flow (that is, from RTL to virtual layout) takes:


o Synthesis = 3h: this covers the full design synthesis, not counting memory macros. Note that for most prototyping purposes, some of the design modules could be abstracted and the process could be significantly faster.

o Place & Route = 1h: these steps are approximate, since the placement is not legalized and the routing uses an approximate Steiner router. As said earlier, at this stage this is more than enough.

o 1 thermal analysis point = a few minutes, which allows many design points to be calculated.

Design flow accuracy (MO7.3.4) — Accuracy of the results obtained is ~15%. This

value is obtained by comparing design parameters after die prototyping and final

layout generation (for example area, max delay, power etc.). It is important to

understand that for design planning purposes this precision is more than enough.

2.3 Conclusions

This deliverable presents the evaluation of the integrated tool for early system floor planning that was developed in task T6.3 and reported in deliverable D6.3.2. In the first part of the deliverable, it is demonstrated that the measurable objective MO7.3.1 of model accuracy within 5% has been achieved for the thermal models applied to the packaged DRAM-on-Logic test case (test case 6).

In the second part of this deliverable, the thermal-aware design prototype tools are evaluated

for a DRAM on logic design. It is demonstrated that the following measurable objectives have

been met:

Ease of use of the design flow (MO7.3.2)

Speed of the complete design flow (synthesis + place & route + thermal analysis) (MO7.3.3)

The obtained design flow accuracy of the design parameters after prototyping is within

15% compared to the final layout generation. (MO7.3.4)


3 Evaluation thermal-aware synthesis and optimization tools (POLITO, together with SNPS-AM)

3.1 Introduction

The purpose of this part of the deliverable is to assess the quality of the prototype thermal-aware synthesis and optimization tools developed by POLITO in T3.3 and T3.4. These tools are tested on the benchmark provided by ST in T1.3 of WP1 in Therminator. For the sake of completeness and to ease the reading of the document, we include a brief section that describes the benchmark and a section that recalls the thermal-aware optimization techniques described in deliverables D3.3.1-D3.3.3 and D3.4.1-D3.4.3 of Therminator.

3.2 Benchmark description

The test-case provided by ST is a subset skeleton of a typical MCU suitable for a wide range

of applications such as motor drives, application control, medical and handheld equipment,

industrial applications, inverters, printers, etc. It includes several general purpose IPs (DMA,

I2C, Timers, USART, SPI, USB, I/Os, etc.) and all the interconnect infrastructure, plus

memory controllers for embedded Flash, embedded RAM and external NVM (Figure 16). ST

delivered a gtech representation of the overall design (generic technology, a technology

independent netlist, good enough to apply any kind of synthesis flow on it) without the analog

IPs, not relevant for the purpose of the test-case.


Figure 16: Test Case Block Diagram

All the external and internal pins are modelled in terms of input and output delays. The test-case is made of a single voltage domain but supports various power modes, depending on the external power manager, and clock domain controls; test aiding logic is also present as part of the normal IP set of the MCU.

The main clock domain (see Figure 17) is a fast AHB one, running at full speed, pacing the CPU, DMAs, embedded memories, external memory controllers and the Clock Controller itself; this is interfaced to two APB domains (by means of two bridges) running at a lower (ratio n) speed where all the remaining IPs are instantiated.


Figure 17: Test Case Clock Tree

3.3 Thermal-Aware Optimization Techniques

The thermal-aware design methodologies developed within work package WP3 target two levels of abstraction: the gate level/RTL and the architectural level. At the gate level, POLITO developed a dual-Vt assignment algorithm that guarantees temperature-insensitive operation of the circuits together with a significant reduction of both leakage and total power consumption (task T3.3), while at the architectural level it implemented a design framework for post-silicon compensation of thermally-induced delays on the clock distribution network (task T3.4). Both solutions have been integrated with standard EDA tools provided by Synopsys within the Galaxy Implementation Platform, in particular Design Compiler for dual-Vth synthesis and IC Compiler for adaptive clock trees. The next two sections provide a brief overview of the two. SNPS-AM has given support to POLITO, according to MO7.3.8.


3.3.1 ITD-Aware Dual-Vth Assignment

It has been proven that CMOS technologies below the 65nm node, such as those used in the THERMINATOR project, suffer from Inverted Temperature Dependence (ITD). ITD manifests in High-Threshold voltage cells (HVT cells), which show a delay reduction with temperature, that is, they get faster as they warm up. This behaviour is in contrast with that of LVT cells, which show a standard thermal dependence, that is, they get slower as they warm up.

The presence of ITD seriously complicates low-power design flows where dual-Vt technologies are adopted to reduce leakage power consumption while keeping delay overheads under control. The main limitation of standard synthesis approaches is that they do not consider temperature as a direct variable in the optimization loop; instead they follow a more conservative approach in which cell libraries are characterized under classical worst-case temperature conditions (typically, 125°C). However, due to ITD effects on HVT gates, the worst-case delay path may occur at the opposite corner, i.e., room-temperature conditions. By ignoring this effect, standard synthesis tools can produce incorrect designs that do not guarantee timing compliance over the full range of operating temperatures.

To overcome this issue, POLITO implemented a new ITD-aware dual-Vt selection algorithm (MO7.3.5) that achieves temperature-insensitive operation of digital circuits. With this solution, designers are able to automatically synthesize circuits that meet the given timing constraints for all allowable operating temperatures, with a significant reduction of leakage power w.r.t. circuits for which temperature insensitivity is achieved by over-constraining the logic synthesis process with tighter delay constraints.

Our proposed synthesis flow is illustrated in Figure 18. The flow has been set up with the objective of achieving maximum compliance with existing commercial tools. We first synthesize the target circuit using the nominal timing constraint Dnom and standard dual-Vt libraries characterized at high temperature (i.e., 125°C). Synthesis at high temperature guarantees worst-case parasitic extraction. We then estimate the worst-case delay of the circuit at both 125°C and 25°C using static timing analysis. Next, we re-synthesize the circuit using the same libraries, but with a tighter timing constraint (i.e., D′nom = α·Dnom, with α < 1). We choose α to be small enough that the new circuit is timing compliant at both 125°C and 25°C (i.e., the worst-case delay is less than Dnom). We note that the value of α is circuit dependent and that the larger the ITD effect, the smaller the value of α must be. At this point, we have a solution which is compliant from a timing viewpoint, but which represents an upper bound for the leakage optimization problem. Using our temperature-insensitive dual-Vt assignment algorithm, we attempt to recover some of this leakage power by searching for an optimal threshold voltage assignment for all the cells in the circuit.
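
The choice of α can be pictured as a simple search that tightens the constraint until timing is met at both temperature corners. The sketch below is an illustration only: `worst_delays` stands in for a synthesis run followed by static timing analysis, and `toy_flow` is an invented model that borrows the 9 ns constraint and the 7.5% room-temperature violation reported in Section 3.4.1. It does not reproduce the Vt assignment algorithm of D3.3.3.

```python
def find_alpha(worst_delays, d_nom, step=0.01, alpha_min=0.5):
    """Largest alpha on the step grid whose re-synthesized circuit meets d_nom.

    worst_delays(constraint) is a stand-in for synthesis plus static timing
    analysis; it returns {temperature_C: worst-case delay}.
    """
    n = 0
    while True:
        alpha = round(1.0 - n * step, 10)
        if alpha < alpha_min:
            raise ValueError("no timing-compliant alpha found above alpha_min")
        delays = worst_delays(alpha * d_nom)
        if max(delays.values()) <= d_nom:
            return alpha
        n += 1


def toy_flow(constraint):
    # Toy model (assumption): timing is met exactly at 125 C, while ITD makes
    # the 25 C worst-case delay 7.5 % longer than the synthesis constraint.
    return {125: constraint, 25: 1.075 * constraint}


print(find_alpha(toy_flow, d_nom=9.0))
```

In a real flow each evaluation of `worst_delays` is a full synthesis run, so the number of candidate α values would be kept small.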


Figure 18: Temperature-Aware Dual-Vt Synthesis Flow.

The proposed dual-Vt assignment strategy, whose details are reported in deliverable D3.3.3, is

thus based on over-constraining the synthesis using a delay constraint smaller than the

nominal one; the difference with respect to a conventional over-constrained solution is that

temperature dependence is considered during the Vt assignment process.

3.3.2 Tunable Clock Tree

On-chip operating temperature variations have a significant impact on the performance of global interconnects. It is well known that high temperatures increase interconnect delays, further degrading circuit performance. This is mainly due to the linear dependency that exists between temperature and the electrical resistance of metal wires. Temperature-induced delay variations on interconnects are extremely critical for clock distribution networks (CDNs), which typically span the entire die, crossing different thermal regions. Devices working at different temperatures may also show significant performance mismatches. As a result, different branches of the CDN have unbalanced delays, that is, branches crossing hot regions get slower, while those crossing cold regions get faster. The resulting difference generates clock skews which may vary dynamically, depending on the workloads.
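
The effect can be illustrated with the linear resistance model R(T) = R_ref · (1 + β · (T − T_ref)) and a lumped RC delay of 0.69 · R · C per branch. The numbers below are hypothetical: the coefficient is a typical value for bulk copper (on-chip wires differ), and the branch resistance, capacitance and temperatures are chosen only to show the order of magnitude.

```python
def wire_resistance(r_ref_ohm, temp_c, tcr_per_c=0.0039, t_ref_c=25.0):
    """Linear temperature model R(T) = R_ref * (1 + tcr * (T - T_ref))."""
    return r_ref_ohm * (1.0 + tcr_per_c * (temp_c - t_ref_c))


def branch_delay_ps(r_ref_ohm, c_ff, temp_c):
    """Lumped RC delay (0.69 * R * C) of one clock branch in picoseconds."""
    # ohm * fF = 1e-15 s = 1e-3 ps
    return 0.69 * wire_resistance(r_ref_ohm, temp_c) * c_ff * 1e-3


# Two nominally identical branches (hypothetical 200 ohm, 500 fF),
# one crossing a hot region and one crossing a cooler region.
hot_ps = branch_delay_ps(200.0, 500.0, temp_c=105.0)
cold_ps = branch_delay_ps(200.0, 500.0, temp_c=45.0)
print(f"hot branch {hot_ps:.1f} ps, cold branch {cold_ps:.1f} ps, "
      f"thermally induced skew {hot_ps - cold_ps:.1f} ps")
```

Because the skew depends on the temperature difference between branches, it changes whenever the workload moves the hotspots, which is what motivates a run-time compensation scheme.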

POLITO proposed a dynamically tunable clock tree architecture that self-adapts the delays in the clock tree under time-varying thermal profiles. Tunability is achieved by means of Tunable Delay Buffers (TDBs), made up of a pair of inverters with a set of capacitive loads in between them. The loads are implemented using transmission gates with NMOS transistors connected to them, and can be activated using dedicated control signals. Each activated control signal adds the corresponding load to the critical path, thus achieving variable delays in discrete steps. We devised a hardware mechanism that selects the appropriate tuning of the clock buffers so as to compensate the skew variations induced by a given thermal gradient. The architecture has two essential elements:

A set of on-chip temperature sensors that detect thermal variations.

A hardware Thermal Management Unit (TMU) that translates this variation into the proper tuning of the buffers, compensating the possible increase of the clock skew.

Figure 19 depicts the scenario showing the buffers and the TDBs (shown as buffers with cross

arrows), the sensors (shown as diamonds), and the TMU.


Figure 19: On-Line Skew Compensation Architecture.

Regarding sensors, we assume that their number and placement on the die are given. The design of the TMU requires identifying the number of buffers that are tunable and calculating the amount of tuning required by each buffer. The TMU is therefore designed based on data obtained from an off-line characterization step. The characterization consists of applying a set of thermal profiles (representing typical operating conditions of the design); for each applied thermal profile, the optimal tuning values are calculated by solving different, independent instances of the optimization problem using a software implementation of the algorithm. Once a reasonable number of profiles have been applied, the various solutions thus obtained are merged using some criterion so as to achieve an overall set of tuning values. More implementation details can be found in document D3.4.2. In order to achieve minimum design overheads, the TMU has been physically implemented by means of a lookup table filled with the tuning values computed for each of the thermal profiles applied during the characterization process.

Figure 20 shows the conceptual architecture of the TMU, which is the object of our validation. Upon detection of a temperature variation, the N sensors route their corresponding readings (properly encoded on a given number of bits) to the TMU. The input values address the corresponding entry of the table (one for each of the S thermal profiles), which contains a tuning configuration for each of the m tunable buffers. These tuning values are then driven (properly encoded on a given number of bits) to the required TDBs.
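
A behavioural sketch of such a lookup table is given below. How the N encoded readings are mapped onto one of the S stored profiles is not detailed here, so the sketch assumes nearest-profile matching on the sensor codes; the function names and the example codes are hypothetical.

```python
def build_tmu_table(profiles, tunings):
    """Pair each characterized profile (N sensor codes) with its tuning (m codes)."""
    if len(profiles) != len(tunings):
        raise ValueError("one tuning configuration is needed per thermal profile")
    return list(zip(profiles, tunings))


def tmu_lookup(table, readings):
    """Return the tuning of the stored profile closest to the sensor readings.

    Nearest-profile matching (squared distance on the codes) is an assumption
    made for this sketch, not the documented addressing scheme.
    """
    def distance(profile):
        return sum((p - r) ** 2 for p, r in zip(profile, readings))

    _, tuning = min(table, key=lambda entry: distance(entry[0]))
    return tuning


# Hypothetical example: N = 3 sensors, S = 2 profiles, m = 4 tunable buffers.
table = build_tmu_table(
    profiles=[(1, 1, 1), (3, 1, 0)],
    tunings=[(0, 0, 0, 0), (2, 0, 1, 0)],
)
print(tmu_lookup(table, readings=(3, 2, 0)))
```

In hardware the same behaviour reduces to an address decoder and a small memory, which is consistent with the stated goal of minimum design overhead.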

Figure 20: Architecture of the TMU


3.4 Results on the testbench

3.4.1 Validating the ITD-Aware Dual-Vth Assignment

The proposed temperature-insensitive dual-Vth technique has been applied and benchmarked

on the arithmetic logic unit of the industrial microcontroller proposed as fourth test-case in

D1.3.1.

The circuit was synthesized with Synopsys Design Compiler using the low-power technology library provided by STMicroelectronics, which consists of HVT and LVT cells. We used HSPICE to characterize the standard libraries. The characterization was done for LVT and HVT cells at the two boundary cases (25°C and 125°C) and at two intermediate temperatures (75°C and 105°C). This allowed us to obtain a complete overview of the temperature-induced effects along the entire temperature range.

After obtaining the netlist of the over-constrained synthesized circuit, we performed the temperature-insensitive dual-Vth assignment. Accurate timing and power analyses were carried out using Synopsys PrimeTime, in which we annotated the signal statistics of the internal nodes obtained through a post-synthesis gate-level logic simulation.

We compare the outputs of four different synthesis flows in order to show the limitations of standard synthesis tools, as well as to demonstrate the superior potential of the temperature-insensitive design methodology w.r.t. a simple over-constrained approach. The first two design flows (syn-125C and syn-25C) use classical dual-Vth synthesis, assuming a single worst-case temperature condition of 125°C and 25°C respectively. The third one, syn-oc-125, is the over-constrained case, where the circuit was synthesized using a timing constraint smaller than the nominal one, and for a worst-case operating condition of 125°C. Finally, temp-ins uses the proposed methodology. The target timing constraint is 9ns, as given in the specification.

Figure 21 plots, for each synthesis flow, the length of the critical paths normalized w.r.t. the target timing constraint given by the specification. The timing analysis was done at room temperature (blue bars), two intermediate temperatures (75°C, orange bars, and 105°C, yellow bars), and 125°C (green bars). The first observation concerns the results of standard synthesis tools. Using the syn-125C approach, the timing constraint is met at 125°C by construction, but when the circuit operates at room temperature, a timing violation occurs (7.5% of Dnom). This is because a path which was non-critical at 125°C, and was therefore mapped with a majority of HVT cells, becomes slower at 25°C, causing a timing fault. A similar problem can occur if we synthesize the circuit at room temperature (the syn-25C approach). Now the timing violation occurs at high temperature, where the low-threshold cells are slower. An obvious solution for the timing-fault problem is to over-constrain the circuit (syn-oc-125). In this case, since the syn-125C approach generates timing violations, we performed a 125°C synthesis under a timing constraint equal to 0.93·Dnom. As shown in the plot, the available slack increases, and we can ensure that even if the path delays increase at room temperature, they still remain below the nominal delay Dnom. At 25°C the critical path violates the dummy constraint of 0.93·Dnom, but still meets the true nominal delay of Dnom. The main shortcoming of this approach is that an over-constrained circuit consumes more area and dissipates more power. Our methodology (temp-ins in the figure) helps to address the problem of timing violation due to the ITD effect. At each temperature the critical path is below the nominal delay and the temperature-induced delay variation is below 5%, ensuring timing compliance at any operating condition. Note that the proposed approach does not eliminate the ITD effect, but gives an effective way to account for it during the synthesis


process. This result demonstrates the measurable objective MO7.3.6 “Comparison with circuits

obtained by traditional synthesis methodologies: Temperature induced Delay variation kept

below the variation reported in the simulation results in the range 25-125°C (less than 5%)”.
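The choice of the over-constraint factor can be checked with a few lines of arithmetic. The sketch below makes the simplifying assumption that no path slows down by more than the 7.5% violation observed for syn-125C at room temperature; the function name is illustrative and not part of the flow described here.

```python
def max_constraint_factor(worst_slowdown):
    """Largest fraction of Dnom usable as the 125°C timing constraint such
    that a path slowed down by `worst_slowdown` still meets Dnom."""
    return 1.0 / (1.0 + worst_slowdown)


slowdown = 0.075  # assumption: worst slowdown equals the 7.5% violation at 25°C
factor = max_constraint_factor(slowdown)

# Delay at 25°C (in units of Dnom) of a path that just meets 0.93*Dnom at 125°C.
delay_at_25c = 0.93 * (1.0 + slowdown)

print(f"constraint factor must not exceed {factor:.4f}")
print(f"delay at 25°C with the 0.93 constraint: {delay_at_25c:.5f} * Dnom")
```

Under this assumption the constraint of 0.93·Dnom sits just below the computed limit, so the slowed-down path still meets Dnom.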

Figure 21: Critical Path distribution over different operating temperatures

Figure 22 reports the leakage power (on the left) and total power consumption (on the right).

The over-constrained approach syn-oc has the largest leakage power dissipation at each of the

four temperatures considered. In the worst case (125°C), the leakage overhead compared with

the syn-125C case is around 7% on average. In contrast, using the proposed approach, temp-ins,

leakage power is much smaller than in all the other cases. For instance, at 125°C, we have a

leakage saving of around 29% w.r.t. the over-constrained case, and a 22% saving compared

to the syn-125C case. Total power dissipation is also reduced in the proposed dual-Vth flow.

As highlighted in the figure, the temp-ins solution shows total power dissipation that is, on

average, 4% lower than the syn-oc case. The nominal case syn-125C assures the minimum

energy consumption, but its functionality is not guaranteed in the entire temperature range.

Figure 22: Leakage power consumption (left) and total power consumption (right)

[Figure 21 plot: normalized worst-case delay (axis from 0.85 to 1.10) of syn-25C, syn-125C, syn-oc-125C and temp-ins at 25C, 75C, 105C and 125C, with the nominal delay constraint and the timing faults marked.]

3.4.2 Validating the Tunable Clock-Tree Methodology

In order to test and validate the effectiveness of the proposed TMU design and optimization,

we applied the flow depicted in Figure 23.

Figure 23: Characterization Flow.

The clock tree is described in the DEF format, obtained using the

Synopsys Galaxy Platform IC; delay and skew calculation is done after parasitics extraction

using the Standard Delay Format (SDF) file. The various thermal profiles have been

generated by applying different realistic workloads to the benchmark. Temperature has been

obtained by jointly using placement and power consumption information. First, we compute a

breakdown of power on a block-by-block basis. The term “block” refers to an entity in the

top-level RTL description. Area and placement information, together with switching power

distribution, were then fed into a logi-thermal simulator which produced a thermal map for

each different workload. Maps correspond to the thermal states. From a given profile, we

calculated the thermal-dependent delay to each sink; the nominal delay (with uniform

temperature distribution) was extracted from the SDF parasitics file generated during physical

design. Insertion delay values were then used to determine the skew constraints to be solved

by an ILP package. We have interfaced this solver with Matlab, where all the experiments were

executed.
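The skew evaluation step of this flow can be sketched as follows. The sink names, delay values and per-sink thermal factors are hypothetical numbers chosen for illustration only, and the ILP-based compensation itself is not reproduced here.

```python
def skew(delays):
    """Clock skew: spread between the latest and the earliest sink arrival."""
    return max(delays.values()) - min(delays.values())


# Hypothetical nominal insertion delays in ps (uniform temperature).
nominal = {"s0": 500.0, "s1": 510.0, "s2": 505.0, "s3": 495.0}
# Hypothetical relative delay change of each sink path for one thermal map.
thermal_factor = {"s0": 1.00, "s1": 1.01, "s2": 1.00, "s3": 0.995}

thermal = {sink: d * thermal_factor[sink] for sink, d in nominal.items()}

nominal_skew = skew(nominal)
thermal_skew = skew(thermal)
# Normalization w.r.t. the skew under a flat temperature distribution;
# values above 1.0 would count as a skew violation.
normalized_skew = thermal_skew / nominal_skew
print(f"normalized thermal skew: {normalized_skew:.3f}")
```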

Figure 24: Normalized results for skew compensation.

Figure 24 reports the results we have obtained in terms of skew compensation normalized

w.r.t. the original skew of the clock-tree, namely, the one obtained performing timing analysis

under a flat temperature distribution. Bars labelled as Thermal Skew refer to the circuit

designed w/o any thermal-aware strategy, whereas those labelled as Compensated Skew refer

to the skew of the clock distribution network optimized with our methodology. In the plot,

bars above 1.0 represent a skew violation. As one can observe, with the proposed adaptive strategy we can substantially compensate the skew induced by uneven temperature distributions, by 14.4% on average, bringing it even below the nominal case. This result demonstrates the measurable objective MO7.3.7 “Clock skew comparison prior and post optimization 15% of clock skew reduction in the range 25-125°C”.

[Figure 24 plot: normalized skew (axis from 0.8 to 1.2) for Thermal Maps 1 to 4, with bars for Thermal Skew and Compensated Skew.]

3.5 Conclusions

In this part of the deliverable, we have benchmarked the thermal-aware optimization

techniques and prototype tools developed within WP3. Results confirm that by applying the

proposed synthesis methodology we obtain a circuit that is temperature insensitive, with a

delay variation (induced by the temperature) that is less than 5% (Figure 21 in Section 3.4.1

demonstrates the measurable objective MO7.3.6 “Comparison with circuits obtained by

traditional synthesis methodologies: Temperature induced Delay variation kept below the

variation reported in the simulation results in the range 25-125°C (less than 5%)”) and a clock

skew reduction of about 15% in the temperature range from 25°C to 125°C (Figure 24

in Section 3.4.2 demonstrates the measurable objective MO7.3.7 “Clock skew comparison prior

and post optimization 15% of clock skew reduction in the range 25-125°C”).

4 Evaluation of thermal and aging aware optimization flow for two-dimensional systems on chips (OFFIS, together with UNIBO/ST, and CEA-LETI)

4.1 Introduction to evaluation of high-level thermal and degradation estimation and optimization

The evaluation of the high-level thermal and degradation estimation and optimization is twofold.

First, it is based on the motion detection use case 5 that has been provided by UNIBO/ST within the project. This part of the evaluation is presented in Section 4.2.1. Section 4.2.1.1 gives a short introduction and motivation to this use case. Next, the synthesis and analysis of the custom ASIC hardware accelerators it contains are presented in Section 4.2.1.2. Together with the floor plan this leads to the overall power distribution of the use case. Section 4.2.1.3 then describes the BGA IC package that has been modelled and used for the subsequent analysis of Section 4.2.1.4. This analysis first presents the green-function that has been characterized in order to cover the package characteristics in terms of the thermal expansion, which depends on the applied materials, layer thicknesses, and cooling equipment. Afterwards, the green-based thermal simulation is applied and the results are compared to low-level FDM-based thermal simulations that are obtained by HotSpot. The two approaches are quantitatively compared and the results are discussed. Finally, in Section 4.2.1.5, the thermal-aware optimization capabilities of the use case and of the flow in general are presented.

Second, Section 4.2.2 presents the evaluation of the green-function aware thermal simulation for the Genepy Multi-Processor System-on-a-Chip platform of use case 4, provided by CEA-LETI. For this purpose, CEA-LETI provided the necessary data for the use case 4 design, including the datasheet of the BGA IC package, a block-level floor plan and transient power traces for the 4 cores within the test chip.

4.2 Technical results

4.2.1 Evaluation of green-function based thermal estimation and optimization based on use case 5

This section addresses the evaluation of the green-function based thermal estimation and

optimization flow developed by OFFIS in WP6 tasks T6.2 and T6.3.

4.2.1.1 Introduction to use case 5 motion detection design

The overall use case structure has been described in the “Technical specification of test cases”

deliverable D1.3.1 [21]. The power and area figures of included system-level IP components

have been provided by UNIBO to OFFIS. This includes estimates for the processor, a DMA,

an external memory controller, a hardware accelerator wrapper, a bus and multiple memory

cuts.

The power and area demand of the ASIC hardware implementation of the motion detection algorithms has been characterized by OFFIS. For this purpose, UNIBO provided C algorithms, which were synthesized to RTL and estimated with the OFFIS PowerOpt synthesis and estimation engine. Of course, any other high-level synthesis tool could have been used for this analysis. In the following, the synthesis constraints and results are presented in detail.

4.2.1.2 Custom ASIC hardware accelerators power and area determination

The custom ASIC part of the use case 5 consists of 9 different C processes, each of them

being synthesized to a separate design entity with a dedicated controller. As resource sharing

is not applied at process level, some of these processes are instantiated multiple times. In

addition, a top-level process exists for the overall program- and data-flow. For high-level

synthesis all of the processes have been passed to PowerOpt with the following set of

synthesis constraints:

- Generic 65nm semiconductor technology

- Ambient temperature: 25°C

- Supply voltage: 1.1V

- Frequency: 200MHz

- Optimization effort: Smallest area

- No constraints on resources, no pipelining or chaining

- No algorithmic optimization such as loop unrolling

Each of the processes is quickly synthesized to Verilog and simulated with a defined testbench using Cadence NCVerilog, and the simulation results in terms of data patterns are back-annotated to the operations of the algorithmic representation. Then, based on these realistic data patterns, the synthesis is executed a second time in order to meet all synthesis constraints and to optimize for the given constraints. The outcome of this synthesis is an RT-level datapath for each process that is used for predicting the power and area and that can be exported for use in a subsequent RT- to gate-level synthesis. For estimation, the internal component library of a 65nm semiconductor technology is used, which contains soft-macro models for all available RT-level components.

Figure 25 gives an overview of the synthesis results for all processes including their power

and area estimates. In addition, the number of controller steps is given for each design entity.

Figure 25: PowerOpt High-Level Synthesis results of motion detection algorithms

In addition to the exported Verilog design, PowerOpt outputs an area and power report for

each design entity, as described in deliverable D1.2.1 [20].

As described in D6.3.3 [19], these reports are read by the floor planner and the block-level

placing is performed. The resulting rectangle of Figure 26 is then placed next to the hardware

accelerator wrapper of the global floor plan provided by UNIBO leading to the final floor

plan of Figure 27.

Figure 26: ASIC hardware block-level floor plan

Figure 27: Block diagram of final floor plan of use case 5

The overall die size is ~3.7x3.7mm and the ASIC hardware part amounts to only about 1% of

it.

Together with the provided power figures of the IP components of UNIBO, we obtained the

power distribution within the die shown in Figure 28 that is used for the analysis.

Figure 28: Power distribution of the use case 5 design

4.2.1.3 Use case 5 IC package properties

Since the IC package is not defined in the use case description but has a crucial impact on the thermal performance, this subsection describes the applied package. A Plastic Ball Grid Array (PBGA) with a lead count of 256 has been used, as described in the Intel Packaging Databook. PBGAs have become the most popular packaging alternative for high I/O devices in the industry [12].

The PBGA is assumed to have a square size with 17mm edge length. Thicknesses of layers

and properties of applied materials are as they are described in the Intel data book [12], [13],

and [14]. For example, the seating plane thickness (BGA layer thickness) is set to 0.4mm, the

molding compound thickness is set to 0.8mm, and the substrate thickness to 0.4mm. Further,

natural convection without any active cooling is assumed.

Figure 29: PBGA package scheme

Figure 29 shows a cross-section through the PBGA package as it has been modelled within

the thermal estimation flow using the package modelling technique proposed in [15].

The overall package is attached to a 2s2p PCB containing 2 power and 2 signal layers as

defined in the JEDEC guidelines [11]. The PCB layer thickness is 0.96mm in total. The

package is surrounded by air at an ambient temperature of 323K (~50°C).

4.2.1.4 Evaluation of developed estimation/analysis and optimization flow and tools of WP6

The important measurable objective MO7.3.9 in this analysis is the comparison of the newly created estimation results with results from well-known low-level tools and techniques. In order to cover this MO, the green-function based approach has been evaluated against the low-level tool HotSpot, which is widely used in the scientific field. Further, HotSpot has been adjusted to silicon measurements for the hardcoded IC package. Besides the deviations in the thermal prediction, the runtimes of both approaches are compared later on.

Figure 30: Error [K] of green-based vs. HotSpot thermal estimation for a 128x128 block grid

At first, a homogeneous power distribution has been assumed and the two approaches have

been compared. Figure 30 shows the error of the green-based thermal estimation in

comparison to the HotSpot internal discrete FDM. The main inaccuracies occur due to the

simplification to homogeneous layers in x- and y-direction. Further, heat is also dissipated

through the borders of the package and not solely in z-direction. As a main consequence the

error is marginal in the centre (below 0.1K) of the die and has its maximum at the corners

with a deviation of 0.25K. The average temperature in this analysis is 10K above ambient and

thus the resulting accuracy is very high.

The runtime of the low-level thermal simulation in HotSpot compared to the green-approach

is presented in Table 3. As can be seen, the green-approach outperforms the low-level FDM by far.

XY-Resolution (at 7 Z-Layers)    Runtime HotSpot [s]    Runtime Green [s]
32 x 32                          1.53                   0.0003
64 x 64                          8.41                   0.0008
128 x 128                        29.16                  0.0049
256 x 256                        80.55                  0.0163
512 x 512                        191.85                 0.0706
1024 x 1024                      360.06                 0.1655
2048 x 2048                      944.34                 0.6278

Table 3: Comparison of thermal estimation runtime HotSpot vs. Green
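The speed-up per resolution follows directly from the runtimes in Table 3. The short sketch below only divides the two runtime columns; the variable names are illustrative.

```python
# (grid edge length, HotSpot runtime [s], green-function runtime [s]) from Table 3
runtimes = [
    (32, 1.53, 0.0003),
    (64, 8.41, 0.0008),
    (128, 29.16, 0.0049),
    (256, 80.55, 0.0163),
    (512, 191.85, 0.0706),
    (1024, 360.06, 0.1655),
    (2048, 944.34, 0.6278),
]

# Speed-up of the green-function approach over HotSpot for each resolution.
speedups = {n: hotspot / green for n, hotspot, green in runtimes}

for n, s in speedups.items():
    print(f"{n} x {n}: {s:.0f}x")
```

The smallest ratio occurs at the finest grid (2048 x 2048) and is roughly 1500; all coarser grids show larger ratios.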

The use case 5 analysis is based on the package-characteristic green-function presented in Figure 31. In contrast to the general evaluation of the approach, which is based on the HotSpot-internal, hardcoded IC package, this green-function has been characterized for the package described in Section 4.2.1.3. As an impulse response, it shows a high temperature peak of about 23K above ambient temperature. Further, it shows the temperature distribution in the neighborhood of the sample power of 0.02W that is placed in the center of the die.
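How an impulse response of this kind turns a power map into a temperature map can be illustrated with a plain 2D convolution. This is a minimal sketch, assuming a linear and shift-invariant response; the 3x3 kernel values are invented for illustration (only the centre value is chosen so that 0.02W yields 23K) and are not the characterized green-function of the PBGA package.

```python
def thermal_map(power, kernel):
    """Temperature rise per cell [K] as the 2D convolution of a power map [W]
    with an impulse response [K/W]. The kernel must have odd dimensions;
    heat that would fall outside the grid is simply dropped."""
    rows, cols = len(power), len(power[0])
    kr, kc = len(kernel) // 2, len(kernel[0]) // 2
    temp = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            p = power[i][j]
            if p == 0.0:
                continue
            # Spread the contribution of source cell (i, j) over its neighbourhood.
            for di in range(-kr, kr + 1):
                for dj in range(-kc, kc + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < rows and 0 <= x < cols:
                        temp[y][x] += p * kernel[di + kr][dj + kc]
    return temp


# Invented response in K/W (assumption, not measured data).
kernel = [[100.0, 300.0, 100.0],
          [300.0, 1150.0, 300.0],
          [100.0, 300.0, 100.0]]

power = [[0.0] * 5 for _ in range(5)]
power[2][2] = 0.02  # sample power in the centre of the die
rise = thermal_map(power, kernel)
print(f"peak rise: {rise[2][2]:.1f} K")
```

A shift-invariant kernel ignores the package borders, which is one plausible reason for the larger deviations at the die corners reported above.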

Figure 31: Green function for used PBGA package

Based on this impulse response characteristic of the PBGA package, Figure 32 presents the

steady state temperature distribution within the die of use case 5 including the effect of

electrothermal coupling. Since the overall power consumption is limited to below 140mW

and the selected PBGA package can dissipate up to 1.5 Watts at a low ambient temperature of

298K or about 0.8W for an ambient temperature of 323K, the thermal simulation results in a

maximum temperature increase of about 16K to the ambient temperature.

The variance within the die temperature distribution is very high, so there are also regions close to the ambient temperature. This is because the 3.7x3.7mm die is attached to a 17x17mm PCB substrate and is covered by molding compound, which spreads the heat over a large area. On average, the temperature is about 7.4K above the ambient temperature.

In comparison to the temperature estimates of UNIBO presented in D6.3.3 [19], the temperature predictions as well as the temperature gradients vary because of a different IC package. The modelled package of this work has a characteristic thermal resistance θCA between IC case and ambient of about 60°C/W. According to [13], this value represents a mid-range package without any active cooling. Heat spreaders, heat sinks or air flow would reduce the θCA parameter by a factor of up to 6, but then the dissipated power would not be sufficient for a significant temperature increase.
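A lumped estimate makes the effect of θCA tangible. The sketch below simply multiplies power by thermal resistance, using the 140mW upper bound and the 60°C/W quoted in this section; it ignores the spatial power distribution and is therefore only an order-of-magnitude check, not a replacement for the simulation.

```python
def temperature_rise(power_w, theta_k_per_w):
    """Steady-state temperature rise [K] across a lumped thermal resistance."""
    return power_w * theta_k_per_w


theta_ca = 60.0  # K/W, case-to-ambient resistance of the modelled package
power = 0.140    # W, upper bound of the use case 5 power consumption

rise_passive = temperature_rise(power, theta_ca)
# Best case with heat spreader, heat sink or air flow (factor of up to 6).
rise_cooled = temperature_rise(power, theta_ca / 6.0)

print(f"without active cooling: {rise_passive:.1f} K")
print(f"with theta_CA reduced by 6: {rise_cooled:.1f} K")
```

The passive estimate lies in the same range as the simulated average rise, while the cooled case drops to little more than one kelvin.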

Figure 32: Steady state temperature distribution of use case 5 design

The temperature distribution of Figure 32 has also been evaluated against a low-level explicit

FDM simulation. Figure 33 shows the deviation map for the use case 5 design. The green-based

simulation shows maximum errors of 2.7% at the top corners, with an average error of 1.52%.

Figure 33: Deviation of green-function based thermal estimation in comparison to discrete FDM simulation for use case 5

In order to analyze the long-term degradation of the use case 5 design, the NBTI models described in [16] and published in [22] have been applied. For this purpose, the floor plan is cut into 11x11 blocks and the maximum temperature and supply voltage at each location are fed into the NBTI models. These models then compute transient traces for the duration of two years. As the models are characterized for a time resolution of 1h, they are applied 17088 times for each block. The analysis runs for about 28 hours and results in the following maximum threshold voltage increase map. Of course, the granularity can be refined to a finer resolution at the cost of a higher runtime.
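The partitioning step can be sketched as a block-wise maximum over the temperature map. The grid below is synthetic and the function name is illustrative; the NBTI models themselves are not reproduced.

```python
def block_max(grid, blocks):
    """Split a square temperature grid into blocks x blocks tiles and return
    the maximum temperature of each tile (requires len(grid) >= blocks)."""
    n = len(grid)
    # Tile boundaries; tile sizes differ by at most one cell if n % blocks != 0.
    edges = [k * n // blocks for k in range(blocks + 1)]
    return [[max(grid[y][x]
                 for y in range(edges[by], edges[by + 1])
                 for x in range(edges[bx], edges[bx + 1]))
             for bx in range(blocks)]
            for by in range(blocks)]


# Synthetic 22x22 map in K that gets hotter towards one corner.
grid = [[300.0 + 0.5 * (x + y) for x in range(22)] for y in range(22)]
worst = block_max(grid, 11)

# Number of NBTI model evaluations implied by the figures quoted above.
evaluations = 11 * 11 * 17088
print(f"{evaluations} model evaluations")
```

With 121 blocks and 17088 time steps per block, the run described above amounts to roughly two million model evaluations.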

Figure 34: Threshold voltage increase after two year degradation

As can be seen, the degradation correlates with the temperature map. The threshold voltage

increases by at most 17.2mV, while the mean degradation is 12.5mV.

The threshold voltage increase describes the worst case degradation that can occur for every

single PMOS transistor within the considered block. Although it is a pessimistic assumption

to have no phases of relaxation for the transistors, it is a common approach in literature to

assume that at least one transistor exists which will operate under this condition.

4.2.1.5 Thermal- and degradation-aware optimization evaluation

As described in D6.3.2, the developed optimization techniques influence the high-level synthesis as well as the ASIC HW floor planner through constraints in order to provide thermal- and degradation-aware results. The potential of these approaches has also been demonstrated on an example design investigated in deliverable D6.3.2 [18].

In use case 5 the custom ASIC hardware part occupies only a small fraction (about 1%) of the

total die size. Further, all different C processes are very similar to each other because each of

them consists of a doubly nested loop with little array computation inside. The resulting power

density is thus also very similar and in consequence temperature gradients within the custom

ASIC hardware block are small.

For these reasons the evaluation of the thermal- and degradation-aware optimization focuses

on the global floor plan and is not limited to the small custom ASIC block. Figure 35 shows

the optimized block diagram of the UNIBO floor plan optimization presented in D6.3.3 [19].

The main difference is that the memory cuts are placed in the centre of the die, leading to a

more homogeneous power and temperature distribution. With this floor plan the peak

temperature is predicted to be 3K lower than the one of the non-optimized floor plan.

Figure 35: Block diagram and steady state temperature distribution of use case 5 design with optimized floor plan of UNIBO

Figure 36 shows the degradation map of the optimized design. The maximum threshold

voltage increase is 15.6mV and thus about 9.3% lower than in the non-optimized case while

the mean degradation remains at 12.5mV. This result demonstrates the measurable objective

MO7.3.9 of quantification of the improvements reached through the thermal and aging aware

optimizations.

Figure 36: Threshold voltage increase after two year degradation with optimized floor plan

4.2.2 Evaluation of green-function based thermal estimation based on Genepy design of CEA-LETI

In order to evaluate the thermal estimation approach based on real-life measurements, CEA-

LETI provided the necessary data for the use case 4 design. This includes the datasheet of the

BGA IC package, a block-level floor plan, as well as transient power traces for the targeted

FFT execution. Regrettably, no detailed power values are available for the surrounding

components.

Figure 37: Steady state temperature distribution of use case 4 design.

Figure 37 shows the predicted steady state temperature distribution of the use case 4 design.

As the power dissipation for the FFT execution is far below 1W and the BGA package can

dissipate several watts without the need for active cooling, the peak temperature increase is

very limited and the overall IC package will not heat up significantly due to the FFT

execution.

The real-life measurements were obtained for the steady state in which the FFT is repeatedly

executed in order to compare the results to the simulation. The temperature plot in Figure 38

has been provided by CEA-LETI. It can be separated into three phases: During the first and

third 600 second frame the cores are idling, while the FFT is repeatedly executed in the

second phase. As can be seen, the idle temperature is approximately 34°C (307K) and the

measurements show a temperature increase of about 1.5K (up to ~35.5°C) during the phase of

FFT execution. This fits very well with the predicted temperature increase of 1.5K in Figure 37,

addressing MO7.3.14 of comparing measurement data against simulations.

Figure 38: Temperature measurements provided by CEA

The following list summarizes the remaining inaccuracies between estimation and real-life measurements and discusses their impact:

- Only power consumption of SMEP cores known and taken into account: As no detailed power values are available for the neighbouring IP-level components of the die, the idle temperature of 34°C has been used as ambient temperature for the simulations.

- Power consumption is assumed to be distributed homogeneously throughout the SMEP cores: The temperature sensors are placed close to the centre of the core and measure the temperature only at this specific point. As a result, the real temperature might be slightly higher or lower than the estimated one, depending on the exact spatial power distribution.

- Electrothermal coupling not considered for target technology: As the technology data of the semiconductor technology that has been used for manufacturing the samples is not available and has not been distributed to the partners, temperature-dependent leakage models have not been characterised. As a consequence, electrothermal coupling can only roughly be estimated. Further, a power breakdown distinguishing between static and dynamic power is not available. As the power dissipation is very low in this use case, the temperature increase is also very limited. Thus, electrothermal coupling is negligible in this design.

- IC package material uncertainties: The package modelling is based on available datasheets that lack precise material characteristics. Thus, during modelling, typical thermal conductivities have been assumed. This assumption will lead to realistic results and will only have a minor impact.

4.3 Conclusions

The proposed multiphysics estimation flow for 2D SoCs, taking into account the power dissipation and place & route information, has not been available before THERMINATOR. The results of this multiphysics estimation flow for 2D SoCs have been presented in numerous deliverables ([15]-[19]) and publications [22][23]. The measurable objective MO7.3.9 of comparison of thermal estimations to estimations done by well-known low-level tools has been demonstrated in Figure 30.

The speed-up of the proposed flow in comparison to a low-level analysis is significant. Once the package has been characterized, the thermal simulation estimate can be obtained multiple times a second, and the electro-thermal coupling converges to a steady state very quickly. The speed-up factor of the pure thermal simulation is about 1500 in comparison to a low-level FDM simulation. In addition, the proposed NBTI models have been evaluated to show a speed-up of 600 in comparison to explicitly solving the reaction-diffusion model. Thus, in total, the flow enables multiple design trade-offs regarding the synthesis parameters and the floor plan as well as long-term reliability analyses.

Use case 5 has shown a peak temperature of 16K above ambient while the mean temperature

increase is 7.4K. Assuming this as a steady state for a duration of two years, the NBTI

degradation models show a maximum threshold voltage increase of PMOS devices of

17.2mV. This degradation correlates strongly with the temperature distribution and has its peak at

the point of highest temperatures. The models have been evaluated to have relative errors

below 10% in comparison to an explicit calculation in [22].

The measurable objective MO7.3.9 of quantification of the improvements reached through the

thermal and aging aware optimizations is demonstrated by a 9.3% lower maximum threshold

voltage shift (Figure 36). Furthermore, use case 4 has shown the agreement between steady-state

simulation and silicon measurements (MO7.3.14). The measured temperature trace shows

only small variations of about 15% around the predicted value.

5 Verification of simulator engines (BME together with POLITO)

5.1 Introduction

5.1.1 The simulator engines developed at BME

5.1.1.1 Logi-thermal simulation

Logi-thermal simulation is a novel method that is capable of determining the thermal

behaviour of digital systems at the gate level.

Due to the uneven event density on the chip surface, digital blocks normally experience time-

variable temperature gradients. Due to the fact that digital gates show a dependence on

temperature, a merely digital, gate-level simulation may strongly deviate from the actual

simulation results when the surface temperature profile of the chip is considered in calculating

the actual delays of the individual gates. Self-consistency between the thermal behaviour and

the digital behaviour of the chip has to be maintained: logic and thermal operation have to be

traced together. When this is assured, the simulation is a logi-thermal simulation. The major

application of such a simulation is to make sure that during timing analysis thermal effects are

considered (thermal-aware signal integrity check). In other words, to make sure that the

digital circuit will properly function under all allowed thermal conditions.

The Celltherm engine developed in Work package 3 (T3.1) uses commercial engines and

glues them together to enable them to perform a logi-thermal simulation. It uses a Fourier

based method for thermal compact model generation. Using these compact models during the

relaxation process the logic simulator iterates with a Fourier based analog thermal solver

which presents the results of the thermal domain.

5.1.1.2 Electro-thermal simulation

The electro-thermal simulator engine developed in Work package 4 (T4.1) is a Spice-

compatible electrical simulator with a thermal extension. Every circuit element's model is

extended with a thermal node. These nodes connect the electrical and thermal parts of the

circuit. Thermal behaviour is modelled by electric equivalent RC circuits: compact models

created using package models and the layout.
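One common form of such an RC compact model is a Foster network, whose response to a power step has a closed form. The sketch below assumes that form; the two stage values are placeholders and not data extracted from any package or layout.

```python
import math


def foster_step_response(stages, power_w, t_s):
    """Temperature rise [K] at time t_s [s] after a power step of power_w [W]
    for a Foster network given as a list of (R [K/W], C [J/K]) stages."""
    return power_w * sum(r * (1.0 - math.exp(-t_s / (r * c)))
                         for r, c in stages)


# Hypothetical two-stage network (time constants 10 ms and 5 s).
stages = [(10.0, 1e-3), (50.0, 1e-1)]

early = foster_step_response(stages, 0.14, 1e-2)   # one fast time constant in
steady = foster_step_response(stages, 0.14, 1e6)   # long after the step
print(f"early: {early:.3f} K, steady state: {steady:.3f} K")
```

In steady state the capacitors no longer matter and the rise reduces to the power multiplied by the sum of the stage resistances.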

The engine is based on the direct method; thus, the iterative solution takes place simultaneously

for the thermal and electrical sub-networks. This makes it possible to consider very fast

changes and to take feedbacks between the two domains into account. Another advantage is

that it permits AC simulation as well.

5.1.2 Measurable objectives

Logi-thermal simulator engines use simplified dissipation models for logic gates and cells to

speed up simulations. The accuracy of the models was verified by transistor-level Spice

simulations. The objective MO7.3.10 was to achieve a matching within 5%.

On the test design from POLITO (done with STMicroelectronics STD cells) we have

successfully performed logi-thermal simulations and determined power density as well as the

evolved hot-spots on the surface.

The electro-thermal engine performs an electric simulation on electric and thermal models, so it was verified against a commercial Spice simulator. In this case measurement results of two integrated circuits were also available, and the simulations agreed well with the measurements (within 10%).

5.2 Detailed description of the verification

5.2.1 Verification of the logi-thermal simulator engine

The CellTherm simulations have been verified with several approaches. From the SPICE

simulation of the transistor-level netlist of the design, the dissipations and delays of the cells

have been extracted. Dissipation and timing data could also have been extracted from a Liberty

database of the process. A Liberty file in our process node (TSMC 0.35 µm) was not present,

so we needed to extract the parameters from SPICE simulations manually (Liberty databases claim an accuracy within 2% of SPICE simulations).

The SPICE simulation time was 64 μs. This was enough to reliably sample power, delay and frequency of the circuit. The average power dissipations were compared to the dissipations reported by CellTherm. As a next step, the equivalent thermal Foster RC networks that represent the structure and layout were transformed to a SPICE-compatible netlist. The average cell powers calculated in the previous step were fed into the Foster RC network, which resulted in the temperature functions of the cells. Finally, these temperature functions simulated with SPICE were compared with the CellTherm temperature curves. The difference between the SPICE and CellTherm temperature curves was less than 0.16%. The calculated difference function can be seen in Figure 39.

Figure 39. Difference between CellTherm and SPICE results
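For readers unfamiliar with Foster networks, the following Python sketch evaluates the temperature rise of such a network for a constant input power, using the closed-form step response ΔT(t) = P · Σ R_i · (1 − exp(−t / (R_i · C_i))). The stage values in the usage note are hypothetical; they are not the values extracted in the project.

```python
import math


def foster_temperature_rise(power_w, stages, t):
    """Temperature rise [K] at time t [s] for a constant power step.

    stages is a list of (R_th [K/W], C_th [J/K]) pairs, one per
    parallel RC stage of the series-connected Foster network.
    """
    return power_w * sum(
        r * (1.0 - math.exp(-t / (r * c))) for r, c in stages
    )
```

With assumed stages `[(10.0, 1e-3), (5.0, 0.1)]` and a power of 2 W, the rise starts at zero and settles at the power multiplied by the sum of the stage resistances.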

Another validation has been done to further verify the correct operation of CellTherm. A simulation was run using CellTherm on the demonstration circuit until the steady-state temperature was reached. The previously mentioned SPICE netlist simulation was then executed at the steady-state temperature obtained with CellTherm. The output frequency of the ring oscillator was measured in both the CellTherm and SPICE results. The SPICE simulation resulted in fosc = 1.453 GHz, while CellTherm gave fosc = 1.524 GHz. The difference is 4.82%. This difference could be further reduced by providing delay values for every type of transition (01, 10, 0x, 1x, etc.) in the SDF file.

5.2.2 Comparison of the two logi-thermal simulator engines

The two logi-thermal engines (CellTherm and Logitherm) have been verified with a

10 mm × 10 mm standard cell digital circuit with a 4-bit D-flip-flop chain and a ring oscillator

circuit. This design is a fictional design and cell sizes are intentionally enlarged to be able to

demonstrate the effect of temperature variations and evolving hot-spots on cell propagation

delay. Power dissipations for logic transitions in the cells are also fictional values large

enough to spectacularly demonstrate the mentioned effects.

The same physical layout definition and SDF delay descriptor file were used in both engines.

In Figure 40 the schematic layout of the design is shown. In the upper part of the chip the four

D-flip-flops form the exciting circuit. The dissipated powers in the DFFR cells are

intentionally chosen to be 1000 times larger (1 mW) than the inverters' dissipated power per

logic transition (1 µW). In the lower part of the layout is the ring oscillator formed by 10

inverters and a kick-in NAND gate.

Figure 40 The schematic layout of the design

The physical layout of the circuit is shown in Figure 41.

Figure 41 The physical layout of the circuit.

Figure 42 shows the comparison of the logic simulation results. The waveforms of the two

engines are plotted together to show their agreement.

Figure 42 The waveforms of the logic simulation

Figure 43 shows that the calculated power dissipation of the logic elements also matches in the

two engines.

Figure 43 Power dissipation diagram of an inverter in the chain.

Finally, Figure 44 shows the transient simulation of an inverter's temperature calculated by

the engines. The temperature as a function of time can be seen on the left hand side of the

figure while the relative error of the results is shown on the right. The initial large error is due

to differences in the numeric algorithms used by the two engines. The error decreases below

5% with the time constant of the system.

Figure 44 Transient simulation of a logic cell's temperature

Based on the simulation results presented above it can be stated that the two engines produce

the same results within an error of 5%. (MO7.3.10)

5.2.3 Real-world evaluation designs

5.2.3.1 Ring oscillator containing 1000 inverter cells

One of the test circuits was a digital ring oscillator containing 1000 inverter cells and one kick-in NAND cell. The technology node was TSMC 0.35 µm. The structural Verilog description of the circuit was generated by a script, and place & route was then carried out in the Mentor Graphics Pyxis environment. The resulting layout is shown in Figure 45. The resulting die size was 2.13 mm × 2.19 mm without pad ring.

Figure 45. Layout of the ring oscillator (1000 inverters)

Logi-thermal simulations were run with CellTherm until a steady-state temperature rise of 0.193 °C (relative to the initial temperature) was reached. The layer structure was silicon (500

µm) on top of glue (50 µm) on top of Kovar (nickel-cobalt ferrous alloy) (500 µm) with

adiabatic boundary conditions at the sides and the top. Steady state temperature was reached

after 3.86 seconds. Power dissipation and thermal distribution maps are shown in Figure 46

and Figure 47.

Figure 46. Power dissipation (1000 inverters)

Figure 47. Temperature map (1000 inverters)
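As a rough plausibility aid for the layer structure described above, the one-dimensional conduction resistance of a stack can be estimated as the sum of thickness / (conductivity × area) over the layers. The sketch below is only such an estimate and does not reproduce the CellTherm model: it assumes uniform heat flow towards an isothermal bottom surface, and the conductivity values are approximate textbook figures (silicon, Kovar) or a pure placeholder (glue).

```python
def stack_thermal_resistance(layers, area_m2):
    """1D conduction resistance [K/W] of a layer stack.

    layers is a list of (thickness [m], conductivity [W/(m*K)]) pairs;
    area_m2 is the cross-section through which the heat flows.
    """
    return sum(thickness / (k * area_m2) for thickness, k in layers)


# Assumed material data, not taken from the deliverable.
ASSUMED_LAYERS = [
    (500e-6, 148.0),  # silicon, approximate room-temperature value
    (50e-6, 1.0),     # glue, placeholder value
    (500e-6, 17.0),   # Kovar, approximate value
]
DIE_AREA_M2 = 2.13e-3 * 2.19e-3  # die size of the 1000-inverter design

r_stack = stack_thermal_resistance(ASSUMED_LAYERS, DIE_AREA_M2)
```

Multiplying `r_stack` by the total dissipated power gives a first-order steady-state temperature rise; lateral spreading and the real boundary conditions are ignored in this estimate.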

5.2.3.2 Test circuit from POLITO

CellTherm was also used to simulate a test design from Politecnico di Torino (POLITO) in an STMicroelectronics 65 nm technology node containing DSPs, LFSRs, multipliers and XOR units. The layout of the synthesized, placed and routed design is shown in Figure 48. Synthesis was

done using Cadence RTL Compiler (RC); placement and routing were done with Cadence Velocity (formerly Encounter). The resulting die size was 0.12 mm × 0.12 mm without pad ring.

Figure 48. Layout of POLITO circuit

The steady-state temperature difference was 0.188 °C, reached after 4.56 seconds. Power and temperature maps are shown in Figure 49 and Figure 50. The layer structure was silicon

(500 µm) on top of glue (50 µm) on top of Kovar (nickel-cobalt ferrous alloy) (500 µm) with

adiabatic boundary conditions at the sides and the top.

Figure 49. Power distribution in POLITO circuit

Figure 50. Temperature distribution in POLITO circuit

In both designs the resulting temperature distribution was quite homogeneous, which is caused by the evenly distributed cell network and the good heat transfer property of silicon. In a design with a more scattered layout with different partitions of digital blocks (cache memory, ALU, etc.) the effect of uneven heat distribution might be better observed. Also, in a mixed-signal design with an analog and a digital part, the cross-influence of the analog and digital parts could be simulated. In CellTherm, this could be achieved by substituting the analog part with a black box for which only the dissipation and the placement on the layout need to be known.

5.2.4 Verification of the electro-thermal engine

The benchmark circuit was the well-known µA741 operational amplifier (Figure 51). With this classical benchmark we studied the effect of the thermal feedback from the output stage to the two transistors of the differential pair of the input stage, and how this effect depends on the actual physical realization (layout, packaging style). We studied two layout variants (from two different vendors), as illustrated in Figure 52.

Figure 51 : Schematic of the µA741 operational amplifier

The transistors marked with yellow circles are simulated using their electro-thermal model.

We were able to acquire this integrated circuit with two different layouts. This fact gave us

the ability to verify our electro-thermal solver in different thermal scenarios.

Figure 52 : The two different layouts, (A) and (B), for the µA741

Figure 53 Transfer characteristics with different loads, panels (A) and (B)

Figure 53 shows the simulated open-loop transfer characteristics with different loads.

These results demonstrate well that the structure of the physical layout (see Figure 52) has a

major effect on the electrical behaviour through thermal coupling.

Figure 54 shows the AC simulation of the same circuit in a feedback configuration together

with measurement results.

Figure 54 Behaviour in the frequency domain with different loads, panels (A) and (B)

These figures convey two pieces of information:

- The simulation and measurement results match excellently. This verifies our electro-thermal simulator.

- Interestingly, in certain feedback configurations we found frequencies in the lower frequency range at which one of the samples behaved as an ideal inductor while the other behaved as an ideal capacitor. The only difference between the two was their physical layout.

Configurable Level 0 Test Case

Figure 55 shows the main elements of our test case. It comprises a configurable RTL level

design and a configurable test bench which contains different scenarios to test the design. The

configuration file defines all important parameters related to the design. This file is used in

the RTL source code, in the test bench and also in the automatic floorplan generation script.

After the configuration has been applied, the RTL design is synthesized with a standard cell library file. We then perform post-synthesis simulation on the obtained gate-level design to measure the switching activity of the design under the different test scenarios. The switching activity data are then used for power consumption estimation of the design.

The floorplan file indicates the location of each section of the design on the chip. This file is

automatically generated by one of our scripts. The file should mainly be used during place &

route. However, in order to achieve more accurate results we can also synthesize the design in

topographical mode and thus utilize the floorplan during synthesis.

We are able to perform all of our design evaluation, timing/power and temperature analysis using the gate-level design produced by the synthesis engine. As a result, the place&route step is not mandatory and may be performed only to obtain more accurate results.

As introduced earlier, we have developed a computational unit as an extension of the previous test cases. It is better suited to exploring thermal implications in a more realistic scenario, such as the effects of temperature (hot-spots and thermal gradients) within a larger-area design.

The configurable design is capable of doing basic floating point operations. These operations

are very common and widely used in every computational system. The unit is completely

parametric. The first set of parameters indicates the width of operands used in computations.

The second one specifies how many parallel instances of the computational unit should be

included in the design.

Figure 56 shows the hierarchy of the design and Figure 57

shows its inputs and outputs.

Figure 55 Test case design process flow.

Each component of the design, named dspunit in the figure, is composed of a set of sub-modules executing different floating point operations. These are two-operand addition (Addf), two-operand multiplication (Mulf), two-operand division (Divf), exponential value calculation (Expf) and natural logarithm calculation (Lnf). These operations are typical of FPUs.

Table 4 shows the list of important design parameters and their description.

Table 4 Configuration Design Parameters – Behavioural.

PARAMETER | Description
DATA_WIDTH | Total width of input operands to the unit. Each input operand is a floating point number.
EXPONENT_WIDTH | Width of exponent part of input floating point operand.
OPCODE_WIDTH | Width of OpCode input port. Can be changed based on number of operations that the computational unit performs.
STATUS_DATA_WIDTH | Width of status output port.
NUMBER_OF_UNITS | Total number of parallel dspunit instantiated in the design.

Table 5 contains a detailed description of the design ports.

Table 5 Input/Output ports description

PORT | Direction | Width | Description
OpCode | Input | NUMBER_OF_UNITS*OPCODE_WIDTH | Indicates the operation which should be performed on the input operands. We use one-hot encoding for this port. Each OpCode activates one of (Addf, Divf, ...) units.
In1 | Input | NUMBER_OF_UNITS*DATA_WIDTH | Input floating point operand to the design.
In2 | Input | NUMBER_OF_UNITS*DATA_WIDTH | Input floating point operand to the design.
Out | Output | NUMBER_OF_UNITS*DATA_WIDTH | Output results of the computation as a floating point value.
StatusOut | Output | NUMBER_OF_UNITS*STATUS_DATA_WIDTH | The status of the last floating point operation done in the design.
Clk | Input | 1 | Input clock port to the design.
ResetL | Input | 1 | Input Synchronous, Active-Low reset.

Figure 56 Hierarchy of one computational block.

Figure 57 Input/output ports of the design
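The relationship between the configuration parameters and the port widths can be captured in a few lines of Python. This is a sketch only: the class name `DesignConfig`, the method `port_widths` and the default values are hypothetical and are not taken from the project's configuration file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignConfig:
    """Behavioural parameters of the test case (default values are assumed)."""
    data_width: int = 32
    exponent_width: int = 8
    opcode_width: int = 5
    status_data_width: int = 4
    number_of_units: int = 4

    def port_widths(self):
        """Return the top-level port widths implied by the parameters."""
        n = self.number_of_units
        return {
            "OpCode": n * self.opcode_width,
            "In1": n * self.data_width,
            "In2": n * self.data_width,
            "Out": n * self.data_width,
            "StatusOut": n * self.status_data_width,
            "Clk": 1,
            "ResetL": 1,
        }
```

Keeping the parameters in a single object mirrors the idea of one configuration file shared by the RTL source, the test bench and the floorplan script.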

Computations are performed in a pipelined manner, meaning that the unit can be fed with a new OpCode and new input values in each clock cycle. Each of the floating point operations (Addf, Divf, ...) is performed within a single clock cycle.

Table 6 shows the result of synthesis done on the design when it has only one computational

unit but different values for DATA_WIDTH parameter. We use a typical standard cell library

(with a typical corner case) for performing synthesis.

We compare the complexity of the presented test case with a basic design, an 8-bit counter. For each selected DATA_WIDTH value we show the number of leaf cells required for the design and also the critical path delay, both relative to the 8-bit counter design.

Table 6 Design Complexity. The number of leaf cells and the critical path delay are given relative to the 8-bit counter design.

DATA_WIDTH | EXPONENT_WIDTH | Leaf Cells | Critical Path Delay
16 | 4 | 353x | 21.6x
32 | 8 | 736x | 26.6x
48 | 16 | 1535x | 34.4x

As introduced earlier, any number of parallel computational units can be instantiated. By changing this number it is possible to evaluate the effect of temperature gradients and hot-spots in an area comparable to that of real digital IP designs.

The floorplan of the design, for placement and routing or for topographical synthesis, is generated automatically by a Python script. The script receives the parameters of the design and produces a TCL file which contains the die size and area constraints for each of the computational units as well as for the sub-modules inside each unit.

Table 7 describes the parameters of the Python script.

Table 7 Configuration Design Parameters – Floorplan

Parameter | Description
GENERATE_SUB_FLOORPLAN | If true, the script produces area constraints for each of the sub-modules (Addf, Divf, ...) of each computational unit.
NUMBER_OF_UNITS | Number of total parallel computational units in the design.
UNIT_WIDTH | Desired width of each computational unit area in the floorplan.
UNIT_HEIGHT | Desired height of each computational unit area in the floorplan.
INTER_UNIT_DISTANCE | Size of the empty space placed between adjacent computational units in the floorplan.
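The kind of script described above might look as follows. This is a hedged sketch, not the project's script: the grid arrangement, the `columns` argument and the Tcl variable names `die_width`, `die_height` and `unit_area` are assumptions, and only plain Tcl `set` statements are emitted instead of any tool-specific floorplan commands. Sub-module constraints (GENERATE_SUB_FLOORPLAN) are left out.

```python
def generate_floorplan_tcl(number_of_units, unit_width, unit_height,
                           inter_unit_distance, columns=2):
    """Return Tcl text with the die size and one bounding box per unit.

    Units are placed on a regular grid with `columns` units per row and
    inter_unit_distance of empty space around every unit.
    """
    rows = -(-number_of_units // columns)  # ceiling division
    pitch_x = unit_width + inter_unit_distance
    pitch_y = unit_height + inter_unit_distance
    lines = [
        f"set die_width {columns * pitch_x + inter_unit_distance}",
        f"set die_height {rows * pitch_y + inter_unit_distance}",
    ]
    for idx in range(number_of_units):
        col, row = idx % columns, idx // columns
        x0 = inter_unit_distance + col * pitch_x
        y0 = inter_unit_distance + row * pitch_y
        # Bounding box as a Tcl list: lower-left x y, upper-right x y
        lines.append(
            f"set unit_area({idx}) "
            f"{{{x0} {y0} {x0 + unit_width} {y0 + unit_height}}}"
        )
    return "\n".join(lines) + "\n"
```

A place & route or synthesis flow would then read these variables and translate them into its own region constraints.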

Figure 58 – Figure 60 show the floorplan of the design and the sample place&route results. In

this example the floorplan specifies strict boundaries for each of the computational units, but it

allows the place&route tool to place each of the internal sub-modules optimizing timing of the

design.

Figure 58: Floorplan of one unit of the test case.

Figure 59: Amoeba view of the placed design.

Figure 59: Floorplan of the design containing 4 computational units.

Figure 60: Sample place&route result for the design containing 4 computational units.

The developed automated test bench generator for the design produces a random stream of floating point values as the inputs of each computational unit. It then specifies the OpCode value for each unit. The OpCode value mainly indicates what task each computational unit should do. The test bench is easily configurable and can produce a variety of scenarios.

Table 8 shows the currently available OpCode values and their operation.

Table 8 OpCode Description

OpCode Value | Description
0x00 | No operation. Data on inputs of computational unit will not be read at all. No activity will happen.
0x01 | Floating point multiplication.
0x02 | Floating point addition.
0x04 | Floating point exponential calculation.
0x08 | Floating point division.
0x10 | Floating point natural logarithm calculation.
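The one-hot OpCode encoding and the random stimulus generation can be sketched in Python as follows. The names `OpCode`, `is_valid_opcode` and `random_stimulus` are hypothetical, and the operands are generated as raw random bit patterns of DATA_WIDTH bits, which is an assumption about how the real test bench generator produces its floating point values.

```python
import random
from enum import IntEnum


class OpCode(IntEnum):
    """OpCode values as listed in Table 8 (one-hot, plus no-operation)."""
    NOP = 0x00
    MULF = 0x01
    ADDF = 0x02
    EXPF = 0x04
    DIVF = 0x08
    LNF = 0x10


def is_valid_opcode(value):
    """True for 0x00 or for exactly one set bit within the 5-bit range."""
    return 0 <= value <= 0x10 and (value & (value - 1)) == 0


def random_stimulus(number_of_units, cycles, data_width=32, seed=0):
    """Yield, per clock cycle, one (opcode, in1, in2) tuple per unit."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    opcodes = list(OpCode)
    for _ in range(cycles):
        yield [
            (rng.choice(opcodes),
             rng.getrandbits(data_width),
             rng.getrandbits(data_width))
            for _ in range(number_of_units)
        ]
```

Restricting the choice of opcodes per unit, for example to `OpCode.NOP` for some units, would be one way to create scenarios with localized activity.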

The logic structure of each computational unit is such that, when it receives a new OpCode, switching activity happens only in the registers and gates related to the sub-module which is responsible for that OpCode. No changes happen in the other sub-modules. Figure 61 shows a snapshot of the Synopsys DVE tool while analysing the gate-level simulation results of the design.

The output of post-synthesis simulation will be compared against the output of functional

simulation to ensure the correctness of the results. The output VCD file produced by post-

synthesis simulation will be used to produce switching activity statistics of the circuit for each

specific time interval. The switching activity statistics will then be used to produce average

power consumption of the design for the target test scenario.
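The step from value changes to interval-wise switching activity and average power can be sketched as below. The sketch works on an already parsed list of (time, signal, value) events rather than on a real VCD file, and the single `energy_per_toggle` value is a simplifying assumption; an actual flow would use cell-specific energies from the library.

```python
from collections import Counter


def toggles_per_interval(events, interval):
    """Count signal toggles in each time interval.

    events is an iterable of (time, signal, value) tuples sorted by time.
    The first value seen for a signal only initializes it and is not
    counted as a toggle.
    """
    last_value = {}
    counts = Counter()
    for time, signal, value in events:
        if signal in last_value and last_value[signal] != value:
            counts[int(time // interval)] += 1
        last_value[signal] = value
    return counts


def average_power(counts, interval, energy_per_toggle):
    """Average power per interval: toggles * energy / interval length."""
    return {idx: n * energy_per_toggle / interval
            for idx, n in counts.items()}
```

The resulting per-interval powers are the kind of input a thermal simulation needs when the activity of the design changes over time.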

Figure 61 Sample post-synthesis simulation waveforms of one computational unit when applying different OpCode values to the design.

5.3 Conclusion

We have verified the logi-thermal simulators with SPICE simulations. Standard cell dissipation and timing data were extracted from SPICE simulations using the equivalent netlists of the cells. With the data extracted at transistor level, the logi-thermal simulators were able to calculate energy dissipations according to the switching activity of the circuit. Delay values were also extracted from SPICE netlists and were used to determine temperature-dependent timing in the test circuit. We extracted the average dissipation of the circuit with SPICE for the first 64 µs. This time limit was chosen as an optimum at which the circuit had already been in every possible state and the average dissipation had settled to a steady-state value. The average power dissipated by each cell in the circuit was the input to the thermal equivalent Foster RC network representing the die and the package. The Foster RC network's response to the input average powers was also simulated with SPICE, which resulted in the surface temperature of the cells. We compared these SPICE temperature results with the logi-thermal engines' results and measured a difference below 0.16%.

Figure 62. Difference between CellTherm and SPICE results

To validate the logi-thermal simulator with another method, we simulated a test circuit with the logi-thermal simulators until steady-state temperatures were reached. This steady-state temperature was saved for later use in a SPICE simulation. The test circuit was a ring oscillator whose oscillation frequency depends on the device temperature. At the steady-state temperature the oscillator frequency was fosc,logi = 1.524 GHz in the logi-thermal simulation. We then simulated the same circuit in SPICE (using the transistor-level netlist) at the same steady-state temperature obtained from the logi-thermal simulator. The resulting frequency in the SPICE simulation was fosc,SPICE = 1.453 GHz. The error is 4.82%. With this method we validated that the temperature-dependent delay calculation in the logi-thermal simulations meets the expected 5% accuracy.

The two logi-thermal engines developed in the project were also compared and the results

show that their outputs match within an error of 5%. (MO7.3.10)

The electro-thermal simulator was validated with measurements as well – the matching was

within 10%.

6 Thermal effects in identification applications (NXP-D)

6.1 Introduction

NXP-DE has done the evaluation and validation of the simulation by means of the test chip

designed in WP4. The ambient influence on temperature distribution on chip level was

measured and simulated (see D6.1.2).

The design comprises several structures to force and measure thermal effects, see Figure 63.

The silicon measurements were done on the basis of the specification described in deliverable D4.1.3. This is the first time that the thermal behaviour of the CMOS 40 nm process has been verified. Deliverable D4.2.1 gives an overview of the specified parameters with respect to thermal influence on the self-heating of analogue circuitry. This was measured and evaluated on silicon.

Figure 63: Testchip Layout

Based on these measurements the verification of the modelling and simulation technology

CAD (TCAD) by Synopsys was done.

6.2 Technical results

To evaluate the impact of the thermal characteristics on packaged devices, two types of packages were evaluated: QFP100 and LCC84.

The NXP Therminator test chip investigations were done in the same way on both packages. The first assessment is to measure the self-heating of the test chip itself. Therefore, for each of the packages QFP100 and LCC84, one device was left open as a reference measurement. The open devices are the reference for evaluating the influence of the package on the self-heating of the silicon. The results of the open packages are discussed in WP4 deliverable D4.2.1.

The measurements of the bipolar test structure on the Therminator test chip proved that measurements and simulations with the tools from Fraunhofer match. The self-heating effect can be seen as a local event at power transistor settings up to 0.25 W.

The reference measurement of the diode voltage difference over temperature shows a behaviour that fits a linear equation: T = 6620.431635 °C/V · x − 276.688 °C, where x is the diode voltage difference in volts. The temperature dependency of the diode voltage is shown in Figure 64. The constant term is nearly equal to 0 K (≈ −273 °C). The difference of about 3 °C is related to a measurement offset of the validation test bench environment. The gradient of the curve is 6620.431635 °C/V, which corresponds to 6.62 °C/mV.

Figure 64: Measurement Result of Bipolar Difference over Temperature
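The fitted line can be applied directly to convert a measured diode voltage difference into a temperature, and back. The following Python sketch uses the two coefficients quoted above; the function names are hypothetical.

```python
# Coefficients of the linear fit quoted in the text.
SLOPE_C_PER_V = 6620.431635   # gradient in degC per volt
OFFSET_C = -276.688           # constant term in degC


def temperature_from_delta_v(delta_v):
    """Temperature [degC] for a diode voltage difference delta_v [V]."""
    return SLOPE_C_PER_V * delta_v + OFFSET_C


def delta_v_from_temperature(temp_c):
    """Inverse of the fit: voltage difference [V] for a temperature [degC]."""
    return (temp_c - OFFSET_C) / SLOPE_C_PER_V
```

A change of 1 mV in the voltage difference shifts the computed temperature by the gradient of about 6.62 °C, which illustrates how sensitive the temperature reading is to small voltage offsets in the measurement setup.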

This model is completely independent of process parameters, thus there is virtually no spread

from device to device. The following evaluations have been done based on this measured

curve between diode difference voltage and the measured temperature.

Figure 65 shows the real measurement in comparison to the theory. It has to be noted that

the offset which was already discussed in the equation above is also visible in the chart.

Figure 65: Measured bipolar trend curve in comparison to theoretical calculated curve

The measured curve is a parallel shift of the theoretically calculated curve. The red curve is

used in the next measurements as a reference to calculate the temperatures out of the voltage

differences of the bipolar transistor pairs.

The temperature distribution on silicon is shown in Figure 66.

At the different power settings of the heater, which is shown conceptually in the middle of the chart, the temperature distribution is strongly influenced by the self-heating of the silicon itself. The reference measurement with the heater switched off is shown by the blue curve. In this case the measured temperature is stable at 39 °C over all bipolar pairs.

When the heater is switched on, the temperature at the first bipolar pair, located nearest to the heater, increases by about 10 °C to 13 °C. The measured distribution of the self-heating effect shows a temperature decrease of 4 °C over a distance of 47 µm.

The self-heating effect can thus be calculated locally on-chip. The influence of self-heating at distances of more than 100 µm can be neglected. The self-heating effect on silicon is also independent of the direction, as it is evenly distributed over the area.

Figure 66: Measured self heating effects on distribution over silicon

Figure 67 compares the behaviour of the Therminator test chips in two configurations: open LCC84 package (green curve) and encapsulated LCC84 package (purple curve).

Figure 67: Measurement results LCC84 open (green) and encapsulated LCC84 (purple)

The encapsulated device has better heat dissipation than the open device. This means that the LCC84 package reduces the self-heating. The chart also indicates that the difference with respect to self-heating is roughly 1 °C.

The same measurement with a QFP100 package yields about the same result, see Figure 68.

Figure 68: Measurement results QFP100 open (blue) and encapsulated QFP100 (red)

Figure 69 illustrates the modelling flow of the thermal behaviour of the NXP testchip.

Figure 69: NXP testchip with transistor heater and various sensor elements for

temperature sensing purpose (left part). 3D TCAD model of the heater and diode sensor

elements (right part).

The verification of the simulation model against measurement is summarized in Figure 70. At lower heating powers the measurement data seem to converge to a fixed value. A parasitic resistor due to a non-ideal layout causes a ground shift, which explains this behaviour. At higher heating powers the effect of the heater starts to dominate and good agreement between simulation and measurement can be observed.

Figure 70: Temperature at different sensor locations (refer to Figure 10) as a function of

heater power. The filled circles indicate measurements negatively offset by 6 degrees

and the open diamonds are simulation results.

6.3 Conclusions

The measurements of the NXP test chip and the results from the Synopsys TCAD modelling and simulation tools match sufficiently (MO7.3.11). The characterization of the diode voltage over temperature is well in line with the theoretical expectation. The impact of the encapsulation on the thermal behaviour with respect to self-heating in silicon could be demonstrated.

7 Evaluation of simulation-based verification, optimization and RSM model generation methodologies (MUN, together with NXP-D, and ST)

7.1 Introduction

Within the THERMINATOR project, MUN worked to integrate and improve its tools and methodologies for thermal-aware design in the partners' design environments, in close cooperation with industrial project partners such as ST [24][26][27], NXP [24] and IMC, with institutes such as FHG and others [25], and with EDA partners such as SNPS [28]. The target of this collaboration has been to improve the design flows of the industrial project partners so that they can speed up their design process and obtain better results. As part of the project, simulation-based methodologies to analyze and reduce the impact of thermal fluctuations on the behaviour, yield or reliability of analog/RF blocks have been developed and applied to industrial test-cases. The simulation-based verification, optimization and RSM model generation methodologies have been evaluated by MunEDA in cooperation with ST and NXP using different test-cases of the industry partners. Exploiting the test cases defined in Task 1.3 and the device models developed in WP2, the predictability of the impact of operating conditions such as thermal variation on the behaviour of the circuits under test has been successfully assessed, demonstrated and documented.

7.2 Technical results

MUN solutions and tools have been applied to several test-cases of the industrial project partners in different process technologies such as 40nm and 28nm. Two test-cases by ST and two test-cases by NXP have been documented in the project, in which MUN technology has been used to analyse and improve the underlying circuit sizing problems. The test-cases have been

- a double-ring oscillator consisting of a main PLL and dither PLL (ST) [26]
- a 2.133GHz Level Shifter in 28nm (ST) [27]
- a Sensor Ring Oscillator with power device transistors based on diode pairs in 40nm (NXP) [24]
- a POR Power-on-Reset in 40nm (NXP) [24]

All test-cases and the underlying design techniques, including the use of MUN methodologies, have been described in detail in deliverable 4.2.2.

Very good technical results have been achieved within the project activities:

- The project partners such as ST and NXP have been able to speed up their design and sizing time and have obtained better and more reliable design results.
- For several of these test-cases this has also been silicon proven.
- Correlations could be measured between simulation and silicon results that led to further enhancements in the evaluated circuit design methodologies, especially for thermal modelling and temperature-influenced effects on the circuits.
- Enhanced statistical analysis methods such as Monte-Carlo analysis under worst-case temperature and corner conditions as well as deterministic Worst-Case-Distance methods have been applied to the investigated circuits.
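
As an illustration of the Monte-Carlo analysis under worst-case temperature conditions mentioned in the list above, the following minimal Python sketch estimates the yield at several temperature corners and reports the worst one. It is not the WiCkeD implementation. The performance model `delay_ps`, its coefficients, the specification limit, the threshold-voltage spread and the temperature corners are all hypothetical placeholders, chosen only to make the example runnable.

```python
import random


def delay_ps(temp_c, dvth):
    """Hypothetical performance model (an assumption, not from the deliverable).

    The delay grows linearly with temperature and with a threshold-voltage
    shift dvth (in volts) caused by process variation.
    """
    return 100.0 + 0.2 * (temp_c - 25.0) + 400.0 * dvth


def yield_at_corner(temp_c, spec_max_ps, n_samples, rng, sigma_vth=0.02):
    """Fraction of Monte-Carlo samples that meet the spec at one temperature."""
    passed = 0
    for _ in range(n_samples):
        dvth = rng.gauss(0.0, sigma_vth)  # one sample of the process variation
        if delay_ps(temp_c, dvth) <= spec_max_ps:
            passed += 1
    return passed / n_samples


def worst_case_yield(corners_c, spec_max_ps, n_samples=2000, seed=0):
    """Return the yield estimate per corner and the corner with the lowest yield."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    per_corner = {
        t: yield_at_corner(t, spec_max_ps, n_samples, rng) for t in corners_c
    }
    worst = min(per_corner, key=per_corner.get)
    return per_corner, worst
```

A call such as `worst_case_yield([-40.0, 25.0, 125.0], spec_max_ps=130.0)` returns the per-corner yield estimates and the limiting corner. With this toy model the hot corner limits the yield, because the delay increases with temperature.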

The results have also been reported as measurable objectives in the relevant WP7 reports for 7.3 as follows:

- Integration of MunEDA WiCkeD tools within NXP-D design environment (MO7.3.11)
- Integration of MunEDA WiCkeD tools within ST design environment [24][26][27]
- Integration of MunEDA WiCkeD tools with SNPS custom design environment tools [28]
- Yield improvement by up to 90% (MO7.3.12, as part of THERMINATOR project objective #4: demonstration of the applicability and effectiveness of the new design solutions through manufacturing of test-chips featuring leading-edge silicon technology, as available from some of the project partners).

The above figures show the yield optimization: on the left side the initial design, which does not fulfil the specification, and the result after using the tool YOP to optimize the yield of all given performances against process variations. Manual sizing could not find an appropriate solution, as the underlying design problem is quite complex and can be pursued manually only with a huge and very costly simulation effort.

- Quantified reduction in design-time from 2 weeks to 3 hours (MO7.3.13, as part of THERMINATOR project objective #5: demonstration of the usability and effectiveness of the new design methodologies and tools by their application to industry-strength design cases made available by some of the project partners)

The above picture shows the device sensitivities within the complex sizing problem of the level shifter in 28nm STMicroelectronics technology [27]. The systematic, automatic approach has the advantage that the optimization algorithms find the best result, taking all constraints into account, much more efficiently than manual sizing, which has to follow a step-by-step, trial-and-error strategy that can take a very long time.
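
The idea of ranking devices by their sensitivities before sizing them can be sketched as follows. This is a generic central finite-difference estimate and not the method used by the MUN tools. The performance function `toy_delay` and the parameter names `w_n`, `w_p` and `l` are hypothetical stand-ins for a circuit simulation and its device parameters.

```python
def sensitivities(perf, nominal, rel_step=0.01):
    """Central finite-difference sensitivity of perf to each parameter.

    Each derivative is multiplied by the nominal parameter value, so that the
    entries are comparable across parameters with different units.
    Nominal values are assumed to be non-zero.
    """
    result = {}
    for name, value in nominal.items():
        step = abs(value) * rel_step
        hi = dict(nominal, **{name: value + step})  # parameter shifted up
        lo = dict(nominal, **{name: value - step})  # parameter shifted down
        result[name] = (perf(hi) - perf(lo)) / (2.0 * step) * value
    return result


def rank_by_sensitivity(sens):
    """Parameter names ordered from most to least influential."""
    return sorted(sens, key=lambda name: abs(sens[name]), reverse=True)


def toy_delay(p):
    """Hypothetical performance: a delay that depends on two widths and a length."""
    return 50.0 / p["w_n"] + 20.0 / p["w_p"] + 0.5 * p["l"]


nominal = {"w_n": 1.0, "w_p": 2.0, "l": 0.03}
ranking = rank_by_sensitivity(sensitivities(toy_delay, nominal))
```

In this toy example `ranking` lists `w_n` first, so an optimizer or a designer would adjust that parameter first. In a real flow each call to `perf` would be a circuit simulation, which is why an automated approach saves time over manual trial and error.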

7.3 Conclusions

The THERMINATOR project has been very fruitful for MUN, especially thanks to the excellent collaboration with the project partners to enhance the methods and tools for use in industrial design environments. It has been possible to carry out a comprehensive, chip-level assessment of the impact of thermal variations on the behaviour of heterogeneous electronic systems. For this, it has been crucial to include the contribution of analog and RF blocks in the analysis. MUN provided simulation-based methodologies to analyze and optimize analog/RF circuits at transistor level. Based on the examination results of the project partners ST and NXP, MUN has extended these methodologies in close cooperation with ST and NXP and also with EDA partner SNPS. The industry partners have integrated these methodologies into their design environments and applied the circuit analysis methodologies provided by MUN to carry out sensitivity analysis of their automotive and identification system designs, especially of the analog, power and RF components of their ICs.

MUN will exploit the achieved results as described in the THERMINATOR dissemination and exploitation plan.

8 Conclusions

T7.3 is the final task of WP7, in which the EDA tools developed in WP3, WP4 and WP6 are evaluated. These design tools and flows have been developed for different application fields, from digital to RF. Each of these application domains has specific requirements for its tools, because the size and the functionality of the circuits are different. The evaluation of the EDA prototype tools has concentrated on two aspects. On the one hand, the accuracy, ease of use in terms of speed, and integration within existing design flows have been addressed. In particular, this type of evaluation has been done for design flows based on new concepts and/or tools for complex systems. Examples include the logi-thermal simulation tools of BME, tools for identification applications of NXP-D, and the thermal models from IMEC for complex three-dimensionally integrated chips. On the other hand, the effectiveness of the developed tools has been demonstrated via improvement of the designs. Examples include control of thermally induced delays and reduction in clock skew by POLITO, smaller threshold voltage shifts through thermal- and aging-aware optimization by OFFIS, and yield improvement and design time reduction by MUN. The demonstration has focused on examples provided by ST and NXP-D, i.e. the major industrial partners in T7.3. This way of working has resulted in direct benefits for the European semiconductor industry with respect to the competition. The advantages for EDA vendors are that they have been able to test their tools on industrial-strength design cases, and that they have a better understanding of the challenges that the semiconductor industry is facing. Finally, the results of T7.3 have been published at many conferences, demonstrating the novelty of the work described in this deliverable.

9 Measurable objectives

The Therminator project objectives are

1. New modelling and simulation capabilities to support accurate circuit thermal analysis and simulation
2. Innovative thermal-aware design techniques, methodologies and prototype tools for controlling, compensating and managing thermally-induced effects on parameters such as timing, (dynamic and leakage) power, reliability and yield
3. Demonstration of the accuracy and ease of integration within existing design flows of the new models by validation against measured data obtained on ad-hoc silicon structures
4. Demonstration of the applicability and effectiveness of the new design solutions through manufacturing of test-chips featuring leading-edge silicon technology, as available from some of the project partners
5. Demonstration of the usability and effectiveness of the new design methodologies and tools by their application to industry-strength design cases made available by some of the project partners

In order to quantify the output of the evaluation work presented in this deliverable,

measurable objectives have been defined. In the table below, all the measurable objectives of

T7.3 are summarized. In this table, it is also shown to which test case of WP1, and to which

Therminator’s project objective the measurable objectives are related. Since T7.3 focuses on

evaluation of EDA-Tools and test-chips, the majority of the measurable objectives are related

to Therminator’s project objectives 3-5.

Area: Evaluation of thermal-aware design prototype tools

| Measurable objective | Innovation | Metric / Quantification | Test case | Project objective |
|---|---|---|---|---|
| MO7.3.1 | IMEC: Evaluate one integrated tool allowing early system floor planning and exploration of many system and physical options and their impact on thermal behaviour | Accuracy versus measurement data, error less than 5 °C [IMEC] | 2 layer DRAM-on-Logic chip stack inside FC-BGA package, Test-case 6 | 4 |
| MO7.3.2 | Thermal-aware Design prototype tools ease of use | Set-up time: 2 days for very complex real-world examples; few hours for typical high-level designs [IMEC] | 2 layer DRAM-on-Logic chip stack inside FC-BGA package, Test-case 6 | 3 |
| MO7.3.3 | Speed of the complete design flow for thermal-aware Design prototype tools | Flow speed: couple of hours for layout, few minutes for thermal [IMEC] | 2 layer DRAM-on-Logic chip stack inside FC-BGA package, Test-case 6 | 4 |
| MO7.3.4 | Thermal-aware Design prototype tools accuracy | Accuracy of prototype tools 15% compared to final layout generation | 2 layer DRAM-on-Logic chip stack inside FC-BGA package, Test-case 6 | 3 |
| MO7.3.5 | Development of a temperature-insensitive multi-Vth synthesis methodology | Circuit’s temperature insensitivity [POLITO] | Test-case 4 | 2 |
| MO7.3.6 | Multi-Vth assignment algorithms for temperature insensitive synthesis flow | Comparison with circuits obtained by traditional synthesis methodologies. Temperature induced delay variation kept below the variation reported in the simulation results in the range 25-125°C (less than 5%) [POLITO] | Test-case 4 | 3 |
| MO7.3.7 | Clock skew minimization thanks to mechanisms compensating for thermally-induced delays | Clock skew comparison prior and post optimization. 15% of clock skew reduction in the range 25-125°C [POLITO] | Test-case 4 | 2 |
| MO7.3.8 | Support aging simulation with Synopsys tools | Modifications of aging model required for aging simulation with Synopsys tools (yes/no) [SNPS-AM] | Test-case 1, 2 | 3 |
| MO7.3.9 | Evaluation of a high-level thermal and aging aware estimation and optimization flow | Comparison of the thermal estimations against the estimations done by well known low-level tools (yes/no). Quantification of the improvements reached through the thermal and aging aware optimizations (yes/no) [OFFIS] | Test-cases 4, 5 | 3 |
| MO7.3.10 | Logithermal simulations on a sample circuit of ST using two different simulation methods | Maximal temperature values, maximum temperature locations, and timing shifts of both simulation methods to agree within 5% [BME] | Sample circuit made in collaboration with POLITO with ST STD CELLS library* | 1 |
| MO7.3.11 | Create analog test chip and evaluate measurement results [NXP-D] | Compare measurement data from analog test chip with simulation results (yes/no) [MUN] | Test Chip NXP | 5 |
| MO7.3.12 | Demonstration of the applicability and effectiveness of the new design solutions through manufacturing of test-chips featuring leading-edge silicon technology, as available from some of the project partners | Integration of MunEDA WiCkeD tools within NXP design environment; yield improvement by 90% | POR power on reset (40nm CMOS Technology) [NXP, MunEDA] | 4 |
| MO7.3.13 | Demonstration of the usability and effectiveness of the new design methodologies and tools by their application to industry-strength design cases made available by some of the project partners | Integration of MunEDA WiCkeD tools within ST design environment; quantified reduction in design-time from 2 weeks to 3 hours | 2.133GHz Level Shifter (28nm CMOS Technology) [STM, MunEDA] | 5 |
| MO7.3.14 | Create test chip with temperature sensors to evaluate 2D SOCS, and evaluate measurement results [LETI] | Provide floor plan, and data needed for simulations of test chip in March 2012 [LETI]. Compare measurements data against simulations (yes/no) [OFFIS] | Test chip LETI, Test-case 5 | 3 |

*The test case has been used to have a common circuit in the collaboration between POLITO and BME. It was originally also planned that BME would evaluate the simulator on testcase 4 provided by ST. All the necessary actions were taken: an NDA and the setup of a dedicated environment at ST Catania to host BME researchers. Unfortunately some procedures took longer than expected, and the opportunity to have BME researchers at ST premises would have been in January 2013, too late to allow the BME organisation to ask for reimbursement of the trip.

The substitute for testcase 4 is still an industry-strength testcase: because of its complexity it cannot be classified as an academic testcase, and it has the added value that it has fostered the collaboration between BME and POLITO. POLITO has also validated its tools on testcase 4, and the results are coherent with those obtained on the sample testcase; this leads us to be very positive about the BME logithermal simulator as well.

10 Publications and presentations

A. Burenkov, J. Lorenz, “Self-heating effects in nano-scaled MOSFETs and thermal aware

compact models”, THERMINIC, 17th International Workshop on Thermal investigations of

ICs and Systems, Paris, 27-29 Sept. 2011, EDA Publishing, pp. 17-18.

FHG, “Tutorial on TCAD Simulations of Nano-CMOS Including Self-Heating”, held at IMC in September 2011 and at IMEC in March 2012

F. Beneventi, A. Bartolini, L. Benini, “Static Thermal Model Learning for High-Performance

Multicore Servers”, Computer Communications and Networks (ICCCN), 2011 Proceedings of

20th International Conference on, Issue Date: July 31 2011-Aug. 4 2011, On page(s): 1 – 6,

Location: Lahaina, HI, USA, ISSN: 1095-2055 Print ISBN: 978-1-4577-0637-0, 2011 IEEE

A. Bartolini, M. Cacciari, A. Tilli, L. Benini, “A distributed and self-calibrating model-

predictive controller for energy and thermal management of high-performance multicores”,

Design, Automation & Test in Europe Conference & Exhibition (DATE), 2011, Grenoble,

France 14-18 March 2011, On page(s): 1 – 6, ISSN : 1530-1591, Print ISBN: 978-1-61284-

208-0. IEEE Press 2011

A. Bartolini, M. Sadri, F. Beneventi, M. Cacciari, A. Tilli, L. Benini, “SCC Thermal Sensor

Characterization and Calibration”, 3rd Many-core Applications Research Community

(MARC) Symposium, Ettlingen, Germany, Issue Date : 5-6 June 2011, On page(s): 7-12, KIT

Scientific Publishing 2011, ISBN 978-3-86644-717-2

A. Bartolini, M. Sadri, F. Beneventi, M. Cacciari, A. Tilli, L. Benini, “A System Level

Approach to Multi-core Thermal Sensors Calibration”, Integrated Circuit and System Design.

Power and Timing Modeling, Optimization, and Simulation, Editor: Ayala J., García-Cámara

B., Prieto M., Ruggiero M., Sicard G., Book Series Title: Lecture Notes in Computer Science,

Page(s): 22- 31, Volume: 6951, Copyright: 2011, Publisher: Springer Berlin / Heidelberg,

ISBN: 978-3-642-24153-6

M. Sadri, A. Bartolini, L. Benini, “Single-Chip Cloud Computer thermal model”, Thermal

Investigations of ICs and Systems (THERMINIC), 2011 17th International Workshop on,

Paris, France 27-29 Sept. 2011, On page(s): 1 – 6, Print ISBN: 978-1-4577-0778-0, IEEE

Press 2011

A. Sassone, A. Calimera, A. Macii, E. Macii, M. Poncino, R. Goldman, V. Melikyan, E.

Babayan, S. Rinaudo, “Investigating the Effects of Inverted Temperature Dependence (ITD)

on Clock Distribution Networks”, Proceedings of Design, Automation & Test in Europe

(DATE’12) conference, Dresden, Germany, 2012.-P.165-167

Wei Liu, V. Tenace, A. Calimera, A. Macii, E. Macii, M. Poncino, “NBTI Effects on Tree-

Like Clock Distribution Networks”, GLSVLSI-12: accepted for publication

A. Sassone, W. Liu, A. Calimera, A. Macii, E. Macii, M. Poncino, “Modeling of thermally

induced skew variations in clock distribution network”, THERMINIC-11: IEEE Thermal

Investigations of ICs and Systems, 2011.

L. M. de Lima Silva, A. Calimera, A. Macii, E. Macii, M. Poncino, "Power Efficient

Variability Compensation Through Clustered Tunable Power-Gating", IEEE Journal on

Emerging and Selected Topics in Circuits and Systems, vol.1, no.3, Sept. 2011

M. Caldera, A. Calimera, A. Macii, E. Macii, M. Poncino, “Minimizing temperature

sensitivity of dual-Vt CMOS circuits using Simulated-Annealing on ISING-like models”,

THERMINIC-10: IEEE Thermal Investigations of ICs and Systems, 2010.

A. Calimera, A. Macii, E. Macii, S. Rinaudo, M. Poncino, “THERMINATOR: Modeling,

control and management of thermal effects in electronic circuits of the future”, THERMINIC-

10: IEEE Thermal Investigations of ICs and Systems, 2010.

A. Calimera, R. Bahar, E. Macii, M. Poncino, “Temperature-Insensitive Dual-Vth Synthesis

for Nanometer CMOS Technologies Under Inverse Temperature Dependence”, IEEE

Transactions on Very Large Scale Integration (VLSI) Systems, vol. 18, no. 11, 2010

A. Timár, M. Rencz, “Studying the influence of chip temperatures on timing integrity”, 12th

IEEE Latin-American Test Workshop. Brazil, 27-30. 03. 2011.

A. Timár, M. Rencz, “Studying the Influence of Chip Temperatures on Timing Integrity Using Improved Power Modeling”, JOURNAL OF LOW POWER ELECTRONICS 7: pp. 1-10. (2011)

A. Timár, Gy. Bognár, M. Rencz, “Improved power modeling in logi-thermal simulation”, 17th International Workshop on Thermal investigations of ICs and Systems. Paris, France, 27-29. 09. 2011.

Gergely Nagy, András Poppe, “A Novel Simulation Environment Enabling Multilevel Power

Estimation of Digital Systems”, Proceedings of the 17th International Workshop on THERMal

INvestigation of ICs and Systems

G. Gangemi, “FP7-Funding Projects THERMINATOR, SMAC, MANON Overview”, MUGM

MunEDA User Group Meeting 2012, October 2012, Munich, Germany

Z. Abbas, M. Olivieri, A. Ripp, G. Strube, M. Yakupov, “Yield optimization for low power

current controlled current conveyor”, SBCCI 2012, September 2012, Brasília, Brazil

Colaci, G. Boarin, A. Roggero, L. Civardi, C. Roma, A. Ripp, M. Pronath, G. Strube:

“Systematic Analysis & Optimization of Analog/Mixed-Signal Circuits Balancing Accuracy

and Design Time”, SBCCI 2011 Brazil, September 2011, Sao Paolo, Brazil

N. Seller, “Optimization of a 2.133GHz level shifter in 28nm”, MUGM MunEDA User Group

Meeting 2011, Munich, Germany

U. Trautner, M. Pronath, “Synopsys Custom and Analog Mixed-Signal Overview & MunEDA

WiCkeD Integration”, MUGM MunEDA User Group Meeting 2010, Munich, Germany

S. Caporale, R. Rovatti, G. Setti, "Representation of PWM signals through time warping", Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on, pp. 3589-3592, 25-30 March 2012.

Reef Eilers, Malte Metzdorf, Sven Rosinger, Domenik Helms, Wolfgang Nebel, “Phase space

based NBTI model”, Proc. of International Workshop on Power and Timing Modeling,

Optimization and Simulation (PATMOS), 2012

Sven Rosinger, Malte Metzdorf, Domenik Helms, Wolfgang Nebel, “Behavioral-Level

Thermal- and Aging-Estimation Flow”, Proc. of 12th Latin-American Test Workshop (LATW),

p. 1-6, 2011

V. Melikyan, A. Gevorgyan, A. Baghdasaryan, H. Melikyan, “Thermal Via’s Placement

Zones Identifying Using Voronoi Diagrams”, Proceedings of the 32nd International Scientific

Conference Electronics and Nanotechnology (ELNANO 2012), Kiev, Ukraine, 2012.-P.77-79

V. Melikyan, Babayan E., Harutyunyan A., Melikyan N., Zargaryan G., “Method of

Reducing Thermal Dependence of Timing Delays of Digital Integrated Circuits”, Proceedings

of 5th All-Russian scientific-technical conference “Problems of Developing Advanced Micro-

and Nanoelectronic Systems -2012” (MES-2012), Moscow, Russia, 2012. –P409-412

V.Sh. Melikyan, A.A. Durgaryan, A.H. Balabanyan, E.H. Babayan, M. Stanojlovic, A.G.

Harutyunyan, “Process-voltage-temperature Variation Detection and Cancellation Using On-

Chip Phase-Locked Loop”, Proceedings of the 56th Electronics, Telecommunications,

Computers, Automatic Control and Nuclear Engineering (ETRAN) Conference, Zlatibor,

Serbia, 2012.-P.EL1.2-1-4

P. Bibilo, A. Solovev, V. Melikyan, A. Harutyunyan, E. Babayan, “Estimation of Power

Consumption of Digital CMOS Circuits Based on Logic Simulation of their Structural

Descriptions”, Proceedings of Engineering Academy of Armenia, Yerevan, Armenia, 2012. –

P.600-610

V. Melikyan, E. Babayan, A. Harutyunyan, “Pattern-Based Approach to Current Density

Verification”, Proceedings of the 4th Small Systems Simulation Symposium 2012, Nis,

Serbia, 2012.-P.58-61

V. Melikyan, A. Balabanyan, E. Babayan, A. Durgaryan, “Decreasing of Frequency

Variation in High-Speed Ring Oscillator using Bandgap Reference”, Proceedings of the 32nd

International Scientific Conference Electronics and Nanotechnology (ELNANO 2012), Kiev,

Ukraine, 2012.-P.79-81

R. Goldman, K. Bartleson, T. Wood, V. Melikyan, E. Babayan, “Synopsys’ Low Power

Design Educational Platform”, Proceedings of the 9th European Workshop on

Microelectronics Education (EWME 2012), Grenoble, France, 2012.-P.23-26

V. Melikyan, E. Babayan, A. Harutyunyan, “Pattern-Based Approach to Current Density

Verification”, Electronics, Faculty of Electrical Engineering, University of Banja Luka,

Volume 16, Number 1, Serbia, 2012.-P.77-82

V. Melikyan, A. Harutyunyan, “Modeling of IC Interconnects and Power Rails”, Chartarapet,

Yerevan, 2012 (in Armenian)

V. Melikyan, A. Durgaryan, A. Khachatryan, H. Manukyan, E. Musayelyan, “Self-

compensating Low Noise Low Power PLL Design”, Proceedings of IEEE East-West Design &

Test Symposium (EWDTS’12), Kharkov, Ukraine, 2012.-P.29-33

V.Sh. Melikyan, S.V. Gavrilov, V.K. Aharonyan, N.K. Aslanyan, A.S. Hovhannisyan, “On-

die CMOS Termination Resistor for USB Transmitter”, RAs National Academy of Science

and SEUA, Yerevan, RA, Vol. 65, N 3, Yerevan, 2012.-P. 295-304

P. Magnone, C. Fiegna, G. Greco, G. Bazzano, E. Sangiorgi, S. Rinaudo, “Modeling of Thermal Network in Silicon Power MOSFETs”, Ultimate Integration on Silicon (ULIS), 14-16 March 2011, Cork, Ireland.

P. Magnone, C. Fiegna, G. Greco, G. Bazzano, S. Rinaudo, E. Sangiorgi, “Numerical Simulation and Modeling of Thermal Transient in Silicon Power Devices”, Ultimate Integration on Silicon (ULIS), pp. 153-156, 6-7 March 2012, Grenoble (France).

P. Magnone, C. Fiegna, G. Greco, G. Bazzano, S. Rinaudo, E. Sangiorgi, “Numerical Simulation and Modeling of Thermal Transient in Silicon Power Devices”, ELSEVIER Solid-State Electronics, in press.

H. Oprins, V. Cherman, B. Vandevelde, M. Stucchi, G. Van der Plas, P. Marchal, and E.

Beyne, “Steady state and transient thermal analysis of hot spots in 3D stacked ICs using

dedicated test chips”, 27th Annual IEEE Thermal Measurement, Modeling and Management

Symposium (SEMI-Therm), March 20-24, 2011, 131-137.

H. Oprins, V. Cherman, B. Vandevelde, C. Torregiani, M. Stucchi, G. Van der Plas, P.

Marchal, and E. Beyne, “Characterization of the Thermal Impact of Cu-Cu bonds achieved

using TSVs on hot spot dissipation in 3D stacked ICs”, Proceedings of ECTC, May 30- June 1,

2011, 861-868.

H. Oprins, V. Cherman, “Numerical and experimental characterization of hot spot dissipation

in 3D stacks”, Electronics Cooling Magazine, Vol. 18(2), 2012, pp. 18-23.

D. Milosevic, H. Oprins, J. Ryckaert, P. Marchal, G. Van der Plas, “DRAM-on-logic Stack –

Calibrated Thermal and Mechanical Models Integrated into A Design Flow”, IEEE Custom

Integrated Circuits Conference (CICC), September 18-21 2011, San Jose, California, invited.

H. Oprins, V. Cherman, B. Vandevelde, G. Van der Plas, P. Marchal, and E. Beyne, “Numerical and experimental characterization of the thermal behavior of a packaged DRAM-on-logic stack”, 62nd Electronic Components and Technology Conference - ECTC, 2012, pp. 1081-1088.

Gergely Nagy, László Pohl, András Timár, András Poppe, “Yield enhancement by logi-

thermal simulation based testing”, Proceedings of the 18th International Workshop on

THERMal INvestigation of ICs and Systems (THERMINIC'12). Budapest, Hungary,

2012.09.25-2012.09.27. pp. 196-199. Paper 42.

Gergely Nagy, András Timár, Albin Szalai, Márta Rencz, András Poppe, “New simulation

approaches supporting temperature-aware design of digital ICs”, Proceedings of the 28th

IEEE Semiconductor Thermal Measurement and Management Symposium (SEMI-

THERM'12). San Jose, USA, 2012.03.18-2012.03.22. pp. 313-318.(ISBN: 978-1-4673-1109-

0)

A. Timár, M. Rencz, “Temperature dependent timing in standard cell designs”, Proceedings of the 18th International Workshop on THERMal INvestigation of ICs and Systems (THERMINIC'12). Budapest, Hungary, 2012.09.25-2012.09.27. pp. 179-183.

A. Timár, M. Rencz, “Real-time heating and power characterization of cells in standard cell designs”, MICROELECTRONICS JOURNAL (2012) IF: [0.919*]

A. Timár, M. Rencz, “Acquiring real-time heating of cells in standard cell designs”, Proceedings of the 13th IEEE Latin-American Test Workshop (LATW'12). Quito, Ecuador, 2012.04.10-2012.04.13. pp. 121-125.

Gergely Nagy, András Poppe, “Simulation Framework for Multilevel Power Estimation and

Timing Analysis of Digital Systems Allowing the Consideration of Thermal Effects”,

Proceedings of the 13th IEEE Latin-American Test Workshop (LATW'12). Quito, Ecuador,

2012.04.10-2012.04.13. pp. 1-5.