© Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues...

50
© Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1. Motivation (Kevin) 2. Thermal issues (Kevin) 3. Power modeling (David) 4. Thermal management (David) 5. Optimal DTM (Lev) 6. Clustering (Antonio) 7. Power distribution (David) 8. What current chips do (Lev) 9. HotSpot (Kevin)

Transcript of © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues...

Page 1: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Overview

1. Motivation (Kevin)2. Thermal issues (Kevin)3. Power modeling (David)4. Thermal management (David)5. Optimal DTM (Lev)6. Clustering (Antonio)7. Power distribution (David)8. What current chips do (Lev)9. HotSpot (Kevin)

Page 2: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

PowerPC G3 Microprocessor

• On-chip temperature sensor (junction temperature)– Based on differential voltage change across

2 diodes of different sizes – Implemented in PowerPC G3/G4 processors

• OS required for control• Instruction Cache Throttling used to

dynamically lower junction temperature

Page 3: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Pentium III

• On-die thermal diode– Coupled with board-level thermal diode

sensor

• Uses– Monitor long-term temperature and

environmental trends– Provide indication of catastrophic failure

Page 4: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Pentium 4

• Thermal ramp rates ~50ºC/second(over whole package)

• Much too high for coarse-grained solutions

• Thermal Monitor– Highly-accurate on-die temperature sensing

circuit– Fast acting temperature control circuit

(~50ns)

Reference CurrentSource

Temperature Sensing Diode

CurrentComparator

PROCHOT#

Page 5: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Pentium 4 -- Thermal Monitor

• Trip Point is calibrated at manufacturing time

• Simple response– Turn processor clocks on/off at 50% duty cycle– For 1.5GHz processor, ~2s on + ~2s off?

Page 6: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Pentium 4 -- Results

• For 200 traces (TPC-C, SPEC, Microsoft)– Thermal design point can be reduced to 75%

of true “max power” with minimal performance loss

Page 7: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Pentium 4

• Thermal monitors allow– Tradeoff between cost and performance– Cheaper package

• More triggers, Less Performance

– Expensive package• No triggers, no performance loss

Page 8: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Architecture-level Thermal Management

• Dynamically adjust execution to control temperature

• Avoid catastrophic failure (heat sink, fan)• Permit use of less expensive package

– Design for less than the worst case– Package costs ~$1/W above ~40 W– Heat sinks, heat pipes, thinned wafers, fans

• Fans reduce battery life– Peak power as high as 150 W now and > 200W in 1-2

generations– Temperatures over 100°C

• More fundamentally -- there is a need for architecture-level thermal modeling– What’s actually going on in there?

Page 9: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

HotSpot project

• Collaboration between HPLP and LAVA Labs (ECE and CS depts. UVa)

• Deal with “hot spots”– Localized heating occurs much

faster than chip-wide• microsec. to millisec.

– Chip-wide treatment is too conservative• seconds to minutes• but there is significant lateral

thermal coupling through the package

• How do we model this?• Prove temperature will be safely bounded

Page 10: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Hot spots in Power4

Temperature “landscape”: space and timeHow to estimate early in the design cycle?

Page 11: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Thermal modeling• Want a fine-grained, dynamic model of

temperature– At a granularity architects can reason about– That accounts for adjacency and package– That does not require detailed designs– That is fast enough for practical use

• HotSpot - a compact model based on thermal R, C– Parameterized to automatically derive a

model based on various• Architectures• Power models• Floorplans• Thermal Packages

Page 12: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Dynamic compact thermal model

Electrical-thermal duality

V temp (T)I power (P)R thermal resistance (Rth)C thermal capacitance (Cth)RC time constant (Rth Cth)

Kirchoff Current Lawdifferential eq. I = C · dV/dt + V/Rthermal domain P = Cth · dT/dt + T/Rthwhere T = T_hot – T_amb

At higher granularities of P, Rth, CthP, T are vectors and Rth, Cth are circuit matrices

T_hot

T_amb

Page 13: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Package we model

Heat sink

Heat spreader

PCB

Die

IC Package

Pin

Interface material

Page 14: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Modeling the package

• Thermal management allows for packaging alternatives/shortcuts/interactions

• HotSpot needs a model of packaging• Basic thermal model:

– Heat spreader– Heatsink– Interface materials (e.g. phase-change films)– Fan/Active cooler (TEC)

• Thermal resistance due to convection• Constriction and bulk resistance for fins• Spreading constriction and bulk resistance for

heatsink base and heat spreader• Thermal resistance for bonding material• Thermal capacitance heat spreader and

heatsink

Page 15: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

“Optimal” package

• Default package is found using:– Power dissipation– Target temperature on chip– Chip area– Clock speed – high or low performance

• Power dissipation and target temperature used to determine resistance value needed

• Needs more work: modern packages are incredibly complex, yet there is still a need to model at higher levels

Now: what can we do with HotSpot?

Page 16: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Equivalent vertical network

Chip

Interface

Spreader

Interface + Sink

Convection

Peripheral spreader nodes

• Diagram is simplified – peripheral nodes

Page 17: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Vertical network parameters

• Resistances– Determined by the corresponding areas

and their cross sectional thickness– R = resistivity x thickness / Area

• Capacitances– C = specific heat x thickness x Area

• Peripheral node areas

Chip

Spreader

North

South

EastWest

Page 18: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Lateral resistances

• Determined by the floorplan and the length of shared edges between adjacent blocks

– "Heat Spreading and Conduction in Compressed Heatsinks", Jaana Behm and Jari Huttunen, in proceedings of the 10th International Flotherm User Conference, May 2001.

Page 19: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Lateral resistances – contd...• Lengths used for silicon

• Lengths used in the spreader

Page 20: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Our model (lateral and vertical)

Interface material(not shown)

Page 21: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Temperature equations

• Fundamental RC differential equation– P = C dT/dt + T / R

• Steady state– dT/dt = 0– P = T / R

• When R and C are network matrices– Steady state – T = R x P – Modified transient equation

• dT/dt + (RC)-1 x T = C-1 x P

– HotSpot software mainly solves these two equations

Page 22: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

HotSpot

• Time evolution of temperature is driven by unit activities and power dissipations averaged over 10K cycles– Power dissipations can come from any power

simulator, act as “current sources” in RC circuit ('P' vector in the equations)

– Simulation overhead in Wattch/SimpleScalar: < 1%

• Requires models of– Floorplan: important for adjacency– Package: important for spreading and time

constants– R and C matrices are derived from the above

Page 23: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Implementation

• Primarily a circuit solver • Steady state solution

– Mainly matrix inversion – done in two steps• Decomposition of the matrix into lower and upper

triangular matrices• Successive backward substitution of solved

variables

– Implements the pseudocode from CLR

• Transient solution– Inputs – current temperature and power – Output – temperature for the next interval– Computed using a fourth order Runge-Kutta

(RK4) method

Page 24: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Transient solution

• Solves differential equations of the form dT + AT = B where A and B are constants– In HotSpot, A is constant but B depends on the

power dissipation– Solution – assume constant average power

dissipation within an interval (10 K cycles) and call RK4 at the end of each interval

• In RK4, current temperature (at t) is advanced in very small steps (t+h, t+2h ...) till the next interval (10K cycles)

• RK – `4` because error term is 4th order i.e., O(h^4)

Page 25: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Transient solution contd...

• 4th order error has to be within the required precision

• The step size (h) has to be small enough even for the maximum slope of the temperature evolution curve

• Transient solution for the differential equation is of the form Ae-Bt with A and B are dependent on the RC network

• Thus, the maximum value of the slope (AxB) and the step size are computed accordingly

Page 26: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Validation

• Validated and calibrated using MICRED test chips– 9x9 array of power dissipators and sensors– Compared to HotSpot configured with same

grid, package

• Within 7% for both steady-state and transient step-response– Interface material (chip/spreader) matters

Page 27: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Current features

• Specification of arbitrary floorplans• Format of floorplan file:

– One line per unit– Line format – <unit-name> \t <width> \t

<height> \t <left-x> \t <bottom-y> \n

• Takes a power trace file as an input and outputs corresponding temperature trace

• Ability to modify package specifactions (type of interface material, size and type of heat spreader and heat sink etc.)

Page 28: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Current floorplan

•Modeled after an Alpha 21364

Page 29: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Current floorplan – CPU core

Page 30: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Soon to be features

• Grid model – RC network per grid cell instead of a block

• Temperature models for wires, pads and interface material between heat sink and spreader

• Better (more user friendly) floorplan specification

• Automatic floorplan generation using classical floorplanning algorithms

Page 31: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Better floorplan specification• Floorplan of current microprocessors has a

structural similarity

• Floorplans similar to MIPS R10K, Pentium and the Alpha 21264

• Pipeline order corresponds to floorplan adjacency

Page 32: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Better floorplan specification

• Sample specification (with % areas) that takes advantage of pipeline order

Page 33: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Automatic floorplan for architects• Why develop an architectural

floorplanning tool?– Thermal modeling requires adjacency

information.– Wire delays make performance depend on

the floorplan.• Goal

– Derive a realistic floorplan using only microarchitectural information

– Trade off thermal efficiency against latency– Simulated annealing based floorplan

optimization for thermal, delay and combined metrics

• Current work. Results will be available soon

Page 34: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Sensors

Caveat emptor: We are not well-versed on sensor

design; the following is a digest of information we have been able to collect from industry sources and the research literature.

Page 35: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Desirable Sensor Characteristics

• Small area• Low Power• High Accuracy + Linearity• Easy access and low access time• Fast response time (slew rate)• Easy calibration• Low sensitivity to process and supply

noise

Page 36: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

PowerPC G3

• (Sanchez et al, Symp. on VLSI Circuits ‘97, COMPCON ‘97)

• 0.35 μ, 2.5V• Area 0.2 mm2

• Power: 10 mW• Precision: ±4.5°• Offset: 12° at process corners• Linearity: < ±4°• Based on thermal diodes and current

mirrors

Page 37: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Types of Sensors

(In approx. order of increasing ease to build)

• Thermocouples – voltage output– Junction between wires of different materials; voltage at

terminals is α Tref – Tjunction

– Often used for external measurements• Thermal diodes – voltage output

– Biased p-n junction; voltage drop for a known current is temperature-dependent

• Biased resistors (thermistors) – voltage output– Voltage drop for a known current is temperature dependent

• You can also think of this as varying R– Example: 1 KΩ metal “snake”

• BiCMOS, CMOS – voltage or current output– Rely on reference voltage or current generated from a

reference band-gap circuit; current-based designs often depend on temp-dependence of threshold

Page 38: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Thermal Sensors in PowerPC

• On-chip temperature sensor (junction temperature)– Based on differential voltage change across

2 diodes of different sizes – Implemented in PowerPC G3/G4 processors

• Instruction Cache Throttling used to dynamically lower junction temperature

Page 39: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Typical Sensor Configuration

PTAT – Proportional to Absolute Temperature

Page 40: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Absolute Sensor 1

Schematics of Delta Vgs Current Reference (left) Generator and Delay Cell (right)

Syal, Lee, Ivanov, Altet, Online Testing Workshop, 2001

Page 41: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Sensors: Problem Issues

• Poor control of CMOS transistor parameters

• Noisy environment– Cross talk– Ground noise– Power supply noise

• These can be reduced by making the sensor larger– This increases power dissipation– But we may want many sensors

Page 42: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

“Reasonable” Values

• Based on conversations with engineers at Sun, Intel, and HP (Alpha)

• Linearity: not a problem for range of temperatures of interest

• Slew rate: < 1 μs– This is the time it takes for the physical

sensing process (e.g., current) to reach equilibrium

• Sensor bandwidth: << 1 MHz, probably 100-200 kHz– This is the sampling rate; 100 kHz = 10 μs– Limited by slew rate but also A/D

• Consider digitization using a counter

Page 43: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

• Mid 1980s: < 0.1° was possible• Precision

– ± 3° is very reasonable– ± 2° is reasonable– ± 1° is feasible but expensive– < ± 1° is really hard

• The limited precision of the G3 sensor seems to have been a design choice involving the digitization

“Reasonable” Values: Precision

P: 10s of mW

Page 44: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Calibration

• Accuracy vs. Precision– Analogous to mean vs. stdev

• Calibration deals with accuracy– The main issue is to reduce inter-die

variations in offset

• Typically requires per-part testing and configuration

• Basic idea: measure offset, store it, then subtract this from dynamic measurements

Page 45: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Dynamic Offset Cancelation

• Rich area of research• Build circuit to continuously, dynamically

detect offset and cancel it

• Typically uses an op-amp

• Has the advantage that it adapts to changing offsets

• Has the disadvantage of more complex circuitry

Page 46: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Role of Precision

• Suppose:– Junction temperature is J– Max variation in sensor is S– Thermal emergency is T

• T = J – S

• Spatial gradients– If sensors cannot be located exactly at

hotspots, measured temperature may be G° lower than true hotspot

• T = J – S – G

Page 47: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Rate of change of temperature

• Our FEM simulations suggest maximum 0.1° in about 25-100 μs

• This is for power density < 1 W/mm2 die thickness between 0.2 and 0.7mm, and contemporary packaging

• This means slew rate is not an issue• But sampling rate is!

Page 48: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Sensors Summary

• Sensor precision cannot be ignored– Reducing operating threshold by 1-2

degrees will affect performance

• Precision of 1° is conceivable but expensive– Maybe reasonable for a single sensor or a

few

• Precision of 2-3° is reasonable even for a moderate number of sensors

• Power and area are probably negligible from the architecture standpoint

• Sampling period <= 10-20 μs

Page 49: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

HotSpot Summary

• HotSpot is a simple, accurate and fast architecture level thermal model for microprocessors

• Over 90 downloads till now• Ongoing active development –

architecture level floorplanning will be available soon

• Download site– http://lava.cs.virginia.edu/HotSpot

• Mailing list– www.cs.virginia.edu/mailman/listinfo/hotspot

Page 50: © Mircea Stan, Kevin Skadron, David Brooks, 2002 Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)

© M

irce

a St

an, K

evin

Ska

dron

, Dav

id B

rook

s, 2

002

Temperature-aware computing:Optimize performance subject to a

thermal constraint