Low-Power Integrated Systems - unibo.it circuits and... · H264 encoding H264 decoding Image...
Slide -1 -DEIS Doctoral School 2010
Low-Power Integrated Systems: A HW/SW perspective
Luca Benini DEIS Universita' di Bologna, Italy
Slide -2 -DEIS Doctoral School 2010
Outline
Introduction
System-Level Power modeling and estimation
Dynamic power management
  Shutdown-based
  Variable voltage devices
  Implementation strategies
Conclusions
Slide -3 -DEIS Doctoral School 2010
Embedded applications: Requirements
[Figure: embedded applications plotted by year of introduction (2005 to 2015) against required efficiency, with levels marked at 5 GOPS/W, 10 GOPS/W, 100 GOPS/W and 1 TOPS/W. Applications shown: sign recognition, A/V streaming, adaptive route, collision avoidance, autonomous driving, 3D projected display, HMI by motion, gesture detection, ubiquitous navigation, Si Xray, Gbit radio, UWB, 802.11n, structured encoding, structured decoding, 3D TV, 3D gaming, H264 encoding, H264 decoding, image recognition, fully recognition (security), auto personalization, dictation, 3D ambient interaction, language, emotion recognition, gesture recognition, expression recognition, mobile base-band. Sources: [Philips/IMEC], [DARPA08]]
0.1-1 TOPS/W embedded platforms by 2015!
Slide -4 -DEIS Doctoral School 2010
Power trend
[Figure: power consumption (log scale) vs. year, 1990 to 2020; curves: gate-oxide leakage, sub-threshold leakage, dynamic power. Annotations: technology innovations (e.g. high-k dielectrics); 30%. Source: [Intel, Microsoft and Stanford]]
Power density trend
[Figure: power density (Watts/cm2, 0 to 150) at 250nm, 180nm, 130nm, 90nm and 65nm; bars: leakage power, dynamic power. Source: [STM ASIC]]
The Era of “Power Limited Scaling”
Slide -5 -DEIS Doctoral School 2010
CMOS circuit power consumption components
P = ½ CswVdd ΔV f + IstVdd + IstaticVdd
Dynamic power consumption (½ CswVdd ΔV f + IstVdd)
  Load switching (including parasitic & interconnect)
  Glitching
  Shoot through power (IstVdd)
Static power consumption (IstaticVdd)
  Current sources – bias currents
  Current dependent logic -- NMOS, pseudo-NMOS, CML
  Junction currents
  Subthreshold MOS currents
  Gate tunneling
Slide -6 -DEIS Doctoral School 2010
Review of Constant Field Scaling
Parameter                  Value       Scaled Value
Dimensions                 L, W, Tox   αL, αW, αTox
Dopant concentrations      Na, Nd      Na/α, Nd/α
Voltage                    V           αV
Field                      E           E
Capacitance                C           αC
Current                    I           αI
Propagation time (~CV/I)   t           αt
Power (VI)                 P           α²P
Density                    d           d/α²
Power density              P/A         P/A
Scale factor α<1
[Figure: transistor cross-section on a conventional silicon substrate, showing source (n+), gate, drain (n+), STI transistor isolation and electron flow, before and after scaling. All features reduce in width and thickness. The shorter distance for electron flow produces faster transistors.]
Slide -7 -DEIS Doctoral School 2010
Supply Voltage Trend
With each generation, voltage has decreased 0.85x, not 0.7x as required for constant field.
Thus, energy/device is decreasing by 50% rather than 65%.
[Figure: Vdd (Volts, 0 to 2.5) vs. technology node: 0.25 µm, 0.18 µm, 0.13 µm, 90 nm, 65 nm, 45 nm]
Slow decline to 0.7V in 22nm (some think nothing below 0.9V)
P = ½ CswVdd ΔV f + IstVdd + IstaticVdd
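The 50% vs. 65% figures can be checked with a few lines of arithmetic. The sketch below assumes that per-device switching energy scales as C·Vdd² (full-swing switching, ΔV = Vdd) and that capacitance shrinks by 0.7x per generation as in the constant-field table; the function and constant names are illustrative only.

```python
def energy_scale(cap_scale: float, vdd_scale: float) -> float:
    """Scaling factor of per-device switching energy, assumed ~ C * Vdd^2."""
    return cap_scale * vdd_scale ** 2


CAP_SCALE = 0.7  # assumption: capacitance follows the dimension scale factor

ideal = energy_scale(CAP_SCALE, 0.70)   # constant-field voltage scaling
actual = energy_scale(CAP_SCALE, 0.85)  # voltage scaling quoted on the slide

print(f"constant field: energy x{ideal:.3f} ({1 - ideal:.1%} reduction)")
print(f"0.85x voltage:  energy x{actual:.3f} ({1 - actual:.1%} reduction)")
```

Under these assumptions the reductions come out at roughly 66% and 49%, in line with the rounded numbers on the slide.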
Slide -8 -DEIS Doctoral School 2010
Active Power Trend
But the number of transistors has been increasing, thus
- a net increase in energy consumption,
- with freq 2x, active power is increasing by 50%
(src: ITRS ’01-’05)
[Figure: expected HP MP power (W, 100 to 300) vs. technology (160 down to 20); curves: ITRS’01 and ITRS’05]
ITRS’05: 198 Watts forever!
P = ½ CswVdd ΔV f + IstVdd + IstaticVdd
Slide -9 -DEIS Doctoral School 2010
Recent (180nm – 65nm) “Real Scaling”
Parameter               Value       Scaled Value
Dimensions              L, W, Tox   0.7 L, 0.7 W, 0.7 Tox
Dopant concentrations   Na, Nd      1.4 Na, 1.4 Nd
Voltage                 V           0.7 V
Performance             F           1.4 F
Power/device            P           0.5 P
Power/chip              P           1 P
Power density           P/A         P/A
Values overlaid on the table for real scaling: 0.9 V, 2.0 F, 1.0 P, 2.0 P/A, 1.5 P
Slide -10 -DEIS Doctoral School 2010
65nm – 22nm “Projected Scaling”
Parameter               Value       Scaled Value
Dimensions              L, W, Tox   0.7 L, 0.7 W, 0.7 Tox
Dopant concentrations   Na, Nd      1.4 Na, 1.4 Nd
Voltage                 V           0.7 V
Performance             F           1.4 F
Power/device            P           0.5 P
Power/chip              P           1 P
Power density           P/A         P/A
Values overlaid on the table for projected scaling: 0.9 V, 0.8 P, 1.2 P/A, 1.2 P
198 Watts forever!? How?
Slide -11 -DEIS Doctoral School 2010
Active-Power Reduction Techniques
P = ½ CswVdd ΔV f + IstVdd + IstaticVdd
Active power can be reduced through:
− Capacitance minimization
  − Power/Performance in sizing
  − Clock-gating
  − Glitch suppression
  − Hardware-accelerators
  − System-on-a-chip integration
− Voltage minimization
  − (Dynamic) voltage-scaling
  − Low swing signaling
  − SOC/Accelerators
− Frequency minimization
  − (Dynamic) frequency-scaling
  − SOC/Accelerators
Slide -12 -DEIS Doctoral School 2010
Static Power
P = ½ CswVdd ΔV f + IstVdd + IstaticVdd
Static energy consumption (IstaticVdd)
  Current sources – even uA bias currents can add up.
  NMOS, pseudo-NMOS – not commonly used
  CMOS CML logic – significant power for specialized use.
  Junction currents
  Subthreshold MOS currents
  Gate tunneling
Slide -13 -DEIS Doctoral School 2010
Passive Power Continues to Explode
[Figure: power density (W/cm2, log scale 0.001 to 1000) vs. gate length (microns, 1 down to 0.01), spanning 1994 to 2005; curves: active power, passive power, gate leakage. Fit of published active and subthreshold CMOS device leakage densities. Src: Nowak, et al]
Leakage is the price we pay for the increasing device performance
Slide -14 -DEIS Doctoral School 2010
Standby-Power Reduction Techniques
Standby power can be reduced through:
− Capacitance minimization
− Voltage-scaling
− Power gating
− Vdd/Vt selection
Slide -15 -DEIS Doctoral School 2010
Where Does the Power Go?
[Pie chart categories: issue queues, reg files, icache/itlb, dcache/dtlb, L2 cache, FUs, result buses, clock, other]
Power profile (dynamic power) of a 4-way superscalar microprocessor
Bottom line: power needs to be reduced across-the-board
Slide -16 -DEIS Doctoral School 2010
Need to consider CPU & System Power
CPU Dominates Thermal Design Power
Multiple Platform Components Comprise Average Power
Mobile PC system power (Note: Based on Actual Measurements)

Component         Thermal Design (TDP)   Average
600/500 MHz uP    37%                    13%
LCD 10"           19%                    30%
HDD               9%                     19%
Memory+Graphics   12%                    15%
Power Supply      10%                    10%
Other             13%                    13%

[Courtesy: N. Dutt; Source: V. Tiwari]
Slide -17 -DEIS Doctoral School 2010
Cost metrics
POWER
  P(t) = I(t)V(t)
  Average power: T^-1 ∫_T P dt
  Peak power: Max_T(P)
PERFORMANCE
  Latency vs. throughput
  Worst case vs. average case
  ~T^-1
Never considered in isolation
  Compound cost metric: C = P·T^α, α>1
  Performance constraints: Min{P} s.t. T < Tmax
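As a rough illustration of these metrics, the sketch below computes average and peak power from uniformly sampled power values, evaluates the compound cost C = P·T^α, and picks the lowest-power design that meets a deadline. The candidate designs and all numbers are hypothetical; they only show how the ranking can change with α.

```python
from typing import Optional, Sequence, Tuple

Design = Tuple[str, float, float]  # (name, average power in W, execution time in s)


def average_power(samples_w: Sequence[float]) -> float:
    """Average of uniformly spaced power samples (discrete form of T^-1 * integral of P dt)."""
    return sum(samples_w) / len(samples_w)


def peak_power(samples_w: Sequence[float]) -> float:
    """Maximum power over the observation window."""
    return max(samples_w)


def compound_cost(power_w: float, time_s: float, alpha: float) -> float:
    """Compound metric C = P * T^alpha (alpha = 1 is energy, alpha = 2 is energy-delay)."""
    return power_w * time_s ** alpha


def min_power_under_deadline(designs: Sequence[Design], t_max_s: float) -> Optional[Design]:
    """Min{P} subject to T < Tmax; returns None when no design is feasible."""
    feasible = [d for d in designs if d[2] < t_max_s]
    return min(feasible, key=lambda d: d[1]) if feasible else None


# Hypothetical design points, for illustration only.
designs = [("fast", 2.0, 1.0), ("slow", 0.8, 2.0)]
for name, p, t in designs:
    print(name, "energy:", compound_cost(p, t, 1.0), "energy-delay:", compound_cost(p, t, 2.0))
```

With these made-up numbers the slow design wins on energy (α = 1) while the fast one wins on energy-delay (α = 2), which is why power and performance are never considered in isolation.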
Slide -18 -DEIS Doctoral School 2010
Battery Properties
Energy-constrained systems do not always target energy minimization.
The charge that can be drawn from a battery depends not only on its capacity (energy budget) but also on the discharge rate.
The goal is lifetime maximization.
Slide -19 -DEIS Doctoral School 2010
Is optimization for low energy always the same as optimization for high performance?

Source code:
    int a[1000];
    c = a;
    for (i = 1; i < 100; i++) {
        b += *c;
        b += *(c+7);
        c += 1;
    }

First compiled version:
    LDR r3, [r2, #0]
    ADD r3,r0,r3
    MOV r0,#28
    LDR r0, [r2, r0]
    ADD r0,r3,r0
    ADD r2,r2,#4
    ADD r1,r1,#1
    CMP r1,#100
    BLT LL3

Second compiled version:
    ADD r3,r0,r2
    MOV r0,#28
    MOV r2,r12
    MOV r12,r11
    MOV r11,r10
    MOV r0,r9
    MOV r9,r8
    MOV r8,r1
    LDR r1, [r4, r0]
    ADD r0,r3,r1
    ADD r4,r4,#4
    ADD r5,r5,#1
    CMP r5,#100
    BLT LL3

Measured results:
    2231 cycles, 16.47 µJ
    2096 cycles, 19.92 µJ

No!
• High performance if the available memory bandwidth is fully used; low energy consumption if memories are in stand-by mode
• Reduced energy if more values are kept in registers
Slide -20 -DEIS Doctoral School 2010
Outline
Introduction
System-Level Power modeling and estimation
Dynamic power management
  Shutdown-based
  Variable voltage devices
  Implementation strategies
Conclusions
Slide -21 -DEIS Doctoral School 2010
Impact of software
For a given hardware platform, the energy to realize a function depends on software
  Operating system
  Different algorithms to embody a function (e.g., sorting)
  Different coding styles
  Application software compilation
Slide -22 -DEIS Doctoral School 2010
Estimation of SW Power
SW consumes power on the hardware
System power models
  Constant additive model (spreadsheet)
  Power state machines (abstract event simulator)
  Instruction-level (ISS)
Tradeoff: functional accuracy vs. speed
Slide -23 -DEIS Doctoral School 2010
The spreadsheet model
Constant power dissipation for each component
Total power consumption by summing contributions
  General-purpose systems
  Backward compatibility
  Component-based
Spreadsheet-based analysis
  Basic budgeting
  Simple “what if” analyses
  No learning curve
Slide -24 -DEIS Doctoral School 2010
Example: spreadsheet analysis
PDA     #Comp  Vdd(V)  Iidle(mA)  Ion(mA)  %on   %idle  I(mA)
Proc    1      3.3     0.5        50       0.7   0.3    35.15
DRAM    1      3.3     0.1        12       0.7   0.3    8.43
FLASH   5      3.3     0.0        9        0.7   0.3    31.5
IR      1      3.3     0.0        64       0.05  0.95   3.2
RTC     1      3.3     0.0        0.1      1     0      0.1
DC-DC   1      -       0.1        5.5      0.99  0.01   5.44
TOT                                                     83.82
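The spreadsheet model is simple enough to reproduce in a few lines. The sketch below assumes the per-component formula implied by the table values, namely I = #Comp × (Ion·%on + Iidle·%idle) with currents in mA; the class and field names are illustrative, not part of any tool.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    count: int
    i_idle_ma: float  # current when idle (mA)
    i_on_ma: float    # current when on (mA)
    frac_on: float    # fraction of time spent on

    def average_current_ma(self) -> float:
        """Time-weighted average current of all instances of this component."""
        frac_idle = 1.0 - self.frac_on
        return self.count * (self.i_on_ma * self.frac_on + self.i_idle_ma * frac_idle)


# Values taken from the PDA table above.
PDA = [
    Component("Proc", 1, 0.5, 50.0, 0.70),
    Component("DRAM", 1, 0.1, 12.0, 0.70),
    Component("FLASH", 5, 0.0, 9.0, 0.70),
    Component("IR", 1, 0.0, 64.0, 0.05),
    Component("RTC", 1, 0.0, 0.1, 1.00),
    Component("DC-DC", 1, 0.1, 5.5, 0.99),
]

total_ma = sum(c.average_current_ma() for c in PDA)
for c in PDA:
    print(f"{c.name:6s} {c.average_current_ma():6.2f} mA")
print(f"TOT    {total_ma:6.2f} mA")
```

A "what if" analysis then amounts to changing one duty cycle or current and re-running the sum.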
Slide -25 -DEIS Doctoral School 2010
Limitations
The estimation is mainly left to the designer
Workload estimation is not a straightforward task
Need a more complex high-level system model
Slide -26 -DEIS Doctoral School 2010
Power State Machines: System Model
Event-driven model (resources & events)
Key feature: No overhead for long inactivity (no events).
[Block diagram: the System contains several Resources, a Power Manager, a DC-DC Converter and a Battery; Users and the Environment issue Requests to the System]
Slide -27 -DEIS Doctoral School 2010
Key Features
Use for component selection and partitioning phases
Abstract away all but the power behavior of the system
The model includes information about the power behavior of each block, block interactions, and the environment which drives the behavior
Each resource is a power state machine
Power manager translates environment stimuli to state changes of system resources
Faster than Constant Additive Model
Slide -28 -DEIS Doctoral School 2010
Power State Machines: Resource Model
Example of PFSM: LCD display unit.
States: BACKLIT (150mW), DISPLAY (50mW), OFF (0mW)
Transition times labeled in the diagram: 0.5msec, 0.1msec, 0.1msec, 10msec, 0.1msec, 11msec
Key features:
  Power associated with states
  Transitions have a cost
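A minimal sketch of such a power state machine is given below. The state powers are the ones listed for the LCD unit; the assignment of the six transition times to specific state pairs is NOT recoverable from the transcript and is a hypothetical choice, as is the assumption that a transition draws the higher of the two state powers.

```python
from dataclasses import dataclass

STATE_POWER_MW = {"BACKLIT": 150.0, "DISPLAY": 50.0, "OFF": 0.0}

# HYPOTHETICAL mapping of the six labeled delays (ms) to transitions.
TRANSITION_MS = {
    ("OFF", "DISPLAY"): 10.0,
    ("OFF", "BACKLIT"): 11.0,
    ("DISPLAY", "BACKLIT"): 0.5,
    ("BACKLIT", "DISPLAY"): 0.1,
    ("DISPLAY", "OFF"): 0.1,
    ("BACKLIT", "OFF"): 0.1,
}


@dataclass
class PowerStateMachine:
    state: str = "OFF"
    energy_uj: float = 0.0
    time_ms: float = 0.0

    def stay(self, duration_ms: float) -> None:
        """Remain in the current state; mW * ms = uJ."""
        self.energy_uj += STATE_POWER_MW[self.state] * duration_ms
        self.time_ms += duration_ms

    def go(self, target: str) -> None:
        """Change state, paying the transition delay and energy."""
        delay_ms = TRANSITION_MS[(self.state, target)]
        # Assumption: the transition draws the higher of the two state powers.
        power_mw = max(STATE_POWER_MW[self.state], STATE_POWER_MW[target])
        self.energy_uj += power_mw * delay_ms
        self.time_ms += delay_ms
        self.state = target


lcd = PowerStateMachine()
lcd.go("DISPLAY")
lcd.stay(100.0)
lcd.go("BACKLIT")
lcd.stay(20.0)
lcd.go("OFF")
print(f"{lcd.energy_uj:.1f} uJ over {lcd.time_ms:.1f} ms")
```

Because time advances only on events (state changes and explicit waits), long inactive periods cost a single `stay` call, which is the property the event-driven model relies on.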
Slide -29 -DEIS Doctoral School 2010
Power State Machines: Additional Components
Workload:
  User/Environment: non-deterministic FSM (models the non-deterministic nature of the requests).
Power supply sub-system
  Battery
  DC-DC converter
Slide -30 -DEIS Doctoral School 2010
Functional Power Models
Objective: Estimate the power dissipated by a specific fragment of code
Needs to track instruction execution
Must be fast (millions of instructions)
  RTL or gate-level simulation is not fast enough
Needs to model processor & memory system
Slide -31 -DEIS Doctoral School 2010
Software Power Estimation: Instruction-Level
ILPA [TMWL96]
Empirical method for characterizing single (or very short sequences of) instructions.
Key issues:
  Evaluation of power dissipation for single instructions.
  Choice of representative instructions for characterization.
Advantage: Roughly architecture-independent.
Slide -32 -DEIS Doctoral School 2010
Instruction-Level Power Characterization
Direct measurement of the currents drawn from the power supply while executing the instructions.
HDL simulation:
  The instructions are simulated on a processor model in some HDL.
  The processor is plugged into a tester machine and simulation traces are applied. The current is measured by the tester.
  Use simulation of a gate-level description of the processor.
Slide -33 -DEIS Doctoral School 2010
Instruction-Level Models
A power cost is assigned to each instruction.
Two components of the cost:
  Static component, called “base cost”: the individual instruction cost without a notion of “state”.
  Dynamic component, called “circuit state effects”: accounts for the previous processor state.
Dynamic cost accounts for events depending on sequences of events (e.g., cache misses, pipeline stalls).
Slide -34 -DEIS Doctoral School 2010
Extracting the model
The base cost is computed as follows:
  An infinite loop containing a total of N copies of the target instruction I is executed.
  The average current is measured as described earlier.
  The power cost is obtained from the values of the current, the supply voltage and the cycles/instruction.
N should not be too small, in order to amortize the loop overhead.
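The base-cost computation can be sketched as follows, assuming the energy per instruction is average current × supply voltage × cycles per instruction × clock period. The measurement values in the example are hypothetical, and the loop-overhead helper simply assumes the loop body is N copies of the instruction plus a fixed number of overhead instructions such as the closing branch.

```python
def base_cost_pj(avg_current_ma: float, vdd_v: float,
                 cycles_per_instr: float, clock_period_ns: float) -> float:
    """Energy per instruction in pJ: (mA * V = mW) times (cycles * ns), and mW * ns = pJ."""
    power_mw = avg_current_ma * vdd_v
    return power_mw * cycles_per_instr * clock_period_ns


def loop_overhead_fraction(n_copies: int, overhead_instrs: int = 1) -> float:
    """Share of executed instructions that are loop overhead rather than the target."""
    return overhead_instrs / (n_copies + overhead_instrs)


# Hypothetical measurement: 30 mA at 3.3 V, single-cycle instruction, 25 ns clock.
print(base_cost_pj(30.0, 3.3, 1.0, 25.0), "pJ per instruction")
for n in (9, 99, 999):
    print(n, "copies ->", f"{loop_overhead_fraction(n):.1%}", "overhead")
```

The second helper makes the remark on N concrete: with 9 copies the branch is 10% of the executed instructions, with 999 copies it is 0.1%.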
Slide -35 -DEIS Doctoral School 2010
Computing program execution cost
Due to the averaging process, the costs for I1 → I2 and I2 → I1 cannot be distinguished.
The cost of a program can be summarized as follows:
$\mathrm{Cost}(\mathrm{Program}) = \sum_i (B_i \cdot N_i) + \sum_{i,j} (O_{i,j} \cdot N_{i,j}) + \sum_k E_k$
where:
  $B_i$: base cost of instruction i.
  $N_i$: # of occurrences of instruction i.
  $O_{i,j}$: dynamic cost of sequence i→j.
  $N_{i,j}$: # of occurrences of sequence i→j.
  $E_k$: other effects, obtained from program profiling.
Slide -36 -DEIS Doctoral School 2010
Instruction-Level power model: example
Example of power cost values (expressed in pJ):

Instruction  Base   Circuit State Effects
Name         Cost   LOAD   DLOAD  ADD    MULT
LOAD         1.98   0.13   0.15   1.19   0.92
DLOAD        2.37          0.17   1.19   0.92
ADD          0.99                 0.26   0.53
MULT         1.19                        0.66

Example of computation:

Evaluation program (initial state is ADD)   Base Cost   Circuit State
DLOAD A←x, B←y                              2.37        1.19
LOAD C←z                                    1.98        0.15
ADD A←C, B                                  0.99        1.19
Total                                       5.34        2.53

Total value = 7.87pJ/(3·25ns) = 104.93μW (Tc = 25ns)
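The evaluation can be reproduced with a short sketch that sums the listed base costs and the circuit-state costs of consecutive instruction pairs. The pair table is stored once per unordered pair, since the I1 → I2 and I2 → I1 costs are not distinguished; the function names are illustrative, and one cycle per instruction is assumed when converting to average power.

```python
BASE_PJ = {"LOAD": 1.98, "DLOAD": 2.37, "ADD": 0.99, "MULT": 1.19}

# Circuit-state costs (pJ), one entry per unordered instruction pair.
PAIR_PJ = {
    ("LOAD", "LOAD"): 0.13, ("LOAD", "DLOAD"): 0.15,
    ("LOAD", "ADD"): 1.19, ("LOAD", "MULT"): 0.92,
    ("DLOAD", "DLOAD"): 0.17, ("DLOAD", "ADD"): 1.19, ("DLOAD", "MULT"): 0.92,
    ("ADD", "ADD"): 0.26, ("ADD", "MULT"): 0.53,
    ("MULT", "MULT"): 0.66,
}


def state_cost_pj(prev: str, cur: str) -> float:
    """Circuit-state cost of executing `cur` right after `prev` (order-insensitive)."""
    if (prev, cur) in PAIR_PJ:
        return PAIR_PJ[(prev, cur)]
    return PAIR_PJ[(cur, prev)]


def program_cost_pj(program, initial_state):
    """Return (total base cost, total circuit-state cost) in pJ."""
    base = sum(BASE_PJ[instr] for instr in program)
    circuit = 0.0
    prev = initial_state
    for instr in program:
        circuit += state_cost_pj(prev, instr)
        prev = instr
    return base, circuit


program = ["DLOAD", "LOAD", "ADD"]
base, circuit = program_cost_pj(program, initial_state="ADD")
total_pj = base + circuit
tc_ns = 25.0
avg_power_uw = total_pj / (len(program) * tc_ns) * 1000.0  # pJ/ns = mW, so x1000 gives uW
print(f"base {base:.2f} pJ, circuit state {circuit:.2f} pJ, average power {avg_power_uw:.2f} uW")
```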
Slide -37 -DEIS Doctoral School 2010
Micro-architectural Power Model
The processor is viewed as an interconnection of macro blocks
  E.g. execution units, register file, etc.
Power models are built for the macros
  E.g. analytical, look-up tables, etc.
Advantage: allows micro-architecture exploration
Disadvantage: no black-box model for COTS processors
Slide -38 -DEIS Doctoral School 2010
FLPA: Functional Level Power Analysis
Between ILPA and micro-architectural
  Fewer parameters than ILPA, less info on internals than micro-architectural
Suitable for complex cores, with limited internal information
Algorithmic parameters require functional simulation (ISS run or code analysis)
Algorithmic parameters
• α: parallelism rate
• β: processing rate
• γ: ext. IM access rate
• ε: DMA activity rate
• τ: ext. DM access rate
Architectural parameters
• F: clock frequency
• MM: internal Mem mode (mapped, bypass, cache, freeze)
• DD: data mapping
• DW: DMA data width
[Laurent03] (example: TI62, TI67 DSPs)
Slide -39 -DEIS Doctoral School 2010
Integrating functional and power models
Estimating HW and SW power consumption together is more effective than considering the two contributions separately. This is because the power consumption of a task mapped onto software is not independent of the implementation of the remaining tasks.
Two approaches:
  Non-interacting (trace-based) HW/SW estimation.
  Concurrent HW/SW estimation.
Slide -40 -DEIS Doctoral School 2010
Non-Interacting HW/SW Power Estimation
Avalanche [LH98]
Target system architecture:
  [Block diagram: CPU (SparcLite), I-Cache, D-Cache, Main Memory, Custom HW (ASICs)]
Power estimation of custom HW done separately (constant power in the model).
Focus on power dissipation of SW and memory hierarchy.
Slide -41 -DEIS Doctoral School 2010
Trace-based Estimator Architecture
Block diagram:
  [Blocks: Application Program, Behavioral-Level Simulator, Program Execution Trace, Software Energy Model, CPU Energy, Memory Trace Profiler, Memory Access Trace, Dinero III, Cache Energy Model, Cache Energy, Main Memory Energy Model, Main Memory Energy, Total System Energy]
Main feature: Exploitation of detailed software, memory, and cache energy models.
Main limitation: No interaction between SW and HW during the estimation.
Slide -42 -DEIS Doctoral School 2010
Concurrent HW/SW Power estimation
[Block diagram: an instruction set simulator (pipeline stages IF, ID, EX, MEM, WB) exports a microarchitecture units utilization interface to the processor power models (processor units, memory models) and an addr/data stream interface to the external power models (Icache, Dcache, Main Memory, Peripherals)]
E.g.: Simplescalar/Wattch
Slide -43 -DEIS Doctoral School 2010
State of the art: MPARM
[Block diagram: four Cores, private memories PRI MEM 1 to PRI MEM 4, SHARED MEM, SEMAPHORES and INTERRUPT CONTROLLER attached to the INTERCONNECTION (STbus or AMBA or Xpipes)]
Simulation is cycle accurate (~ 24 Kcycles/sec with 4 cores on a 2-proc Pentium III, 1GHz, 512MB)
Slide -44 -DEIS Doctoral School 2010
Power modeling
Invoked from hardware modules after activation events on a cycle-by-cycle basis
Energy info is passed to data collector routine at each cycle
[Diagram: MEMORY (or CACHE) MODULE, Power Model, Data Collector]
1. The module calls the power model function (passing the memory state, receiving the energy spent)
2. The module sends the energy consumption info (energy spent) to the data collector routines
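The two-step flow can be sketched in a few lines. This is only an illustration of the call pattern, not MPARM code: the class names, the per-access energy values and the `cycle` method are all hypothetical.

```python
from collections import defaultdict


class DataCollector:
    """Accumulates the energy reported by each module (hypothetical interface)."""

    def __init__(self) -> None:
        self.energy_pj = defaultdict(float)

    def record(self, module_name: str, energy_pj: float) -> None:
        self.energy_pj[module_name] += energy_pj


def memory_power_model(memory_state: str) -> float:
    """Hypothetical per-cycle energy (pJ) as a function of the memory state."""
    return {"read": 12.0, "write": 15.0, "idle": 0.5}[memory_state]


class MemoryModule:
    def __init__(self, name: str, collector: DataCollector) -> None:
        self.name = name
        self.collector = collector

    def cycle(self, memory_state: str) -> None:
        energy = memory_power_model(memory_state)  # step 1: call the power model
        self.collector.record(self.name, energy)   # step 2: report to the collector


collector = DataCollector()
ram0 = MemoryModule("RAM 0", collector)
for state in ["read", "write", "idle", "read"]:
    ram0.cycle(state)
print(dict(collector.energy_pj))
```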
Slide -45 -DEIS Doctoral School 2010
Power model for ARM core
Power statistics for the ARM core are collected in a different way
Need to account for idle power when ARM module is stalled (ISS not invoked)
[Diagram: ARM MODULE, Power Model, Data Collector]
1. The ISS calls the data collector routine (passing the core state)
2. The data collector routine gets the energy information (energy spent) from the power model
Slide -46 -DEIS Doctoral School 2010
Using power models
ISS core: SWI_METRIC_START

Initialization (installs the handler):
    ...
    RegisterSWI(SWI_METRIC_START, metric_start_swi_call);
    ...
    uint32_t metric_start_swi_call(CArmProc *arm, uint32_t r0, uint32_t r1,
                                   uint32_t r2, uint32_t r3)
    {
        statobject->startMeasuring(arm->ID);
        return r0;
    }

Program (handler invocation):
    ...
    __asm ("swi " SWI_METRIC_STARTstr);
    ...

• The handler can be easily modified to be invoked by a pseudo-hardware module for collection of system power statistics
Slide -47 -DEIS Doctoral School 2010
Power profiling
Waveforms: cycle by cycle consumption
Output file: totals

Power estimation
----------------
Energy spent:
  ARM 0   core: 25609147.30 [pJ]   cache: 105048808.17 [pJ]
  ARM 1   core: 25609092.30 [pJ]   cache: 105048808.17 [pJ]
  ARM 2   core: 25609092.30 [pJ]   cache: 105048808.17 [pJ]
  ARM 3   core: 25614207.30 [pJ]   cache: 105048808.17 [pJ]
  RAM 0: 2825183.87 [pJ]
  RAM 1: 2825183.87 [pJ]
  RAM 2: 2825183.87 [pJ]
  RAM 3: 2824958.26 [pJ]
  RAM 4: 0.00 [pJ]
  BUS: 50778876.39 [pJ]
Power spent:
  ARM 0   core: 51.18 [mW]   cache: 209.95 [mW]
  ARM 1   core: 51.18 [mW]   cache: 209.95 [mW]
  ARM 2   core: 51.18 [mW]   cache: 209.95 [mW]
  ARM 3   core: 51.18 [mW]   cache: 209.95 [mW]
  RAM 0: 5.65 [mW]
  RAM 1: 5.65 [mW]
  RAM 2: 5.65 [mW]
  RAM 3: 5.65 [mW]
  RAM 4: 0.00 [mW]
  BUS: 101.49 [mW]
Slide -48 -DEIS Doctoral School 2010
Energy characterization of communication primitives
[Figures: power distributions for send and power distributions for receive, for message sizes of 128 bytes and 256 bytes]
Slide -49 -DEIS Doctoral School 2010
DVFS Model
Performance: if fCK1 = k * fCK2 (k>1)
  k CPU1 sim cycles ↔ 1 CPU2 sim cycle
  TaccCPU1→L1 = k * TaccCPU2→L1
  TaccL2, TaccDRAM = const
DVFS model: simulation snapshot (Simics & Ruby)
  [Diagram: CPU1 ... CPU N, each with a private L1, running at f1 = k1 * fnom, f2 = k2 * fnom, ..., fN = kN * fnom; L2 at fL2; DRAM at fDRAM; Network]
Gate delay model:
  $T_g = \frac{1}{f} = L_f \cdot \frac{V_{dd}}{(V_{dd} - V_t)^{\alpha}}$
MODEL INITIALIZATION
  Nominal values $f$, $V_{dd}$, $V_t$ → proportionality constant $L_f$
RUN TIME
  $f$ (run-time selected frequency), $L_f$, $V_t$ → $V_{dd}$ (associated supply voltage)
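The two phases can be sketched numerically. The code below assumes the alpha-power delay law 1/f = L_f · Vdd / (Vdd − Vt)^α with α ≥ 1, computes L_f from nominal values at initialization, and at run time finds the lowest Vdd that sustains a requested frequency by bisection. All numeric values (2.4 GHz, 1.2 V, 0.3 V, α = 1.3, the 1.5 V upper bound) are hypothetical.

```python
def prop_const(f_nom_hz: float, vdd_nom: float, vt: float, alpha: float) -> float:
    """Initialization: L_f from nominal f, Vdd, Vt using 1/f = L_f * Vdd / (Vdd - Vt)^alpha."""
    return (vdd_nom - vt) ** alpha / (f_nom_hz * vdd_nom)


def max_freq(vdd: float, vt: float, alpha: float, l_f: float) -> float:
    """Highest frequency sustainable at a given supply voltage."""
    return (vdd - vt) ** alpha / (l_f * vdd)


def vdd_for_freq(f_hz: float, vt: float, alpha: float, l_f: float,
                 vdd_max: float, tol: float = 1e-9) -> float:
    """Run time: lowest Vdd in (Vt, vdd_max] that sustains f_hz, found by bisection.

    Relies on max_freq being increasing in Vdd, which holds for alpha >= 1.
    """
    if max_freq(vdd_max, vt, alpha, l_f) < f_hz:
        raise ValueError("requested frequency not reachable below vdd_max")
    lo, hi = vt + 1e-6, vdd_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if max_freq(mid, vt, alpha, l_f) >= f_hz:
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical nominal operating point.
F_NOM, VDD_NOM, VT, ALPHA = 2.4e9, 1.2, 0.3, 1.3
L_F = prop_const(F_NOM, VDD_NOM, VT, ALPHA)

for k in (1.0, 0.75, 0.5):
    v = vdd_for_freq(k * F_NOM, VT, ALPHA, L_F, vdd_max=1.5)
    print(f"f = {k:.2f} * fnom -> Vdd = {v:.3f} V")
```

Bisection is used here only because the equation has no closed-form solution for Vdd when α is not an integer; a look-up table of (f, Vdd) pairs would serve the same purpose.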
Slide -50 -DEIS Doctoral School 2010
Power Model
Power model interface
Simulation snapshot
  [Diagram: CPU1 ... CPU N, each with a private L1, running at f1 = k1 * fnom, f2 = k2 * fnom, ..., fN = kN * fnom; L2; DRAM; Network]
Inputs collected on a sampling window of 1.3us:
  i-th CPU: # cycles active, # cycles stall, # cycles idle, # cycles PG; operating conditions $f$, $V_{dd}$, $T$
  i-th L1: # line & WD reads, # line & WD writes
  i-th L2: # line reads, # line writes
  DRAM: # burst reads, # burst writes
Slide -51 -DEIS Doctoral School 2010
Power Model
Power model equations:
  $P_{dyn\text{-}Active} = K_d \cdot f \cdot V_{dd}^2$
  $P_{sta\text{-}Active} = Z \cdot V_{dd} \cdot T^2 \cdot e^{-\frac{q V_t}{K T}}$ (leakage)
  $P_{dyn\text{-}Stall} = K_{ds} \cdot P_{dyn\text{-}Active}$
  $P_{dyn\text{-}Idle} = K_{di} \cdot P_{dyn\text{-}Active}$
  $P_{dyn\text{-}PowerGating} = K_{dpg} \cdot P_{dyn\text{-}Active}$
  $P_{sta\text{-}Stall} = K_{ss} \cdot P_{sta\text{-}Active}$
  $P_{sta\text{-}Idle} = K_{si} \cdot P_{sta\text{-}Active}$
  $P_{sta\text{-}PowerGating} = K_{spg} \cdot P_{sta\text{-}Active}$
MODEL INITIALIZATION
  Inputs: CPU nominal values (power per cycle) $P_{sta}$, $P_{dyn}$ for the Active, Stall, Idle and PowerGating states, with nominal $f$, $V_{dd}$, $T$, $V_t$
  Outputs: proportional constants per CPU $K_d$, $Z$, $K_{ds}$, $K_{di}$, $K_{dpg}$, $K_{ss}$, $K_{si}$, $K_{spg}$
RUN TIME
  Inputs: the proportional constants and the run-time operating conditions $f$, $V_{dd}$, $T$, $V_t$
  Outputs: power per cycle at the specific operating condition for the i-th CPU: $P_{sta}$, $P_{dyn}$ for the Active, Stall, Idle and PG states
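A possible reading of the initialization/run-time split is sketched below. It assumes that dynamic power scales as f·Vdd², that leakage scales as Vdd·T²·exp(−q·Vt/(K·T)), and that the stall, idle and power-gated states are fixed fractions of the active values; these functional forms are an interpretation of the slide, and every number in the example is hypothetical.

```python
import math
from dataclasses import dataclass

K_BOLTZ_EV = 8.617e-5  # Boltzmann constant in eV/K, so q*Vt/(K*T) = Vt / (K_BOLTZ_EV * T)


@dataclass
class OperatingPoint:
    f_hz: float
    vdd: float
    temp_k: float
    vt: float


def dyn_shape(op: OperatingPoint) -> float:
    """Assumed dependence of dynamic power on operating conditions: f * Vdd^2."""
    return op.f_hz * op.vdd ** 2


def leak_shape(op: OperatingPoint) -> float:
    """Assumed dependence of leakage power: Vdd * T^2 * exp(-q*Vt / (K*T))."""
    return op.vdd * op.temp_k ** 2 * math.exp(-op.vt / (K_BOLTZ_EV * op.temp_k))


def calibrate(nominal_op: OperatingPoint, nominal_w: dict) -> dict:
    """Initialization: derive K_d, Z and the per-state ratios from nominal (dyn, sta) powers."""
    dyn_active, sta_active = nominal_w["active"]
    return {
        "K_d": dyn_active / dyn_shape(nominal_op),
        "Z": sta_active / leak_shape(nominal_op),
        "ratios": {s: (d / dyn_active, st / sta_active) for s, (d, st) in nominal_w.items()},
    }


def power_w(consts: dict, op: OperatingPoint, state: str):
    """Run time: (dynamic, static) power of a CPU state at the given operating point."""
    k_dyn, k_sta = consts["ratios"][state]
    return (k_dyn * consts["K_d"] * dyn_shape(op),
            k_sta * consts["Z"] * leak_shape(op))


# Hypothetical nominal characterization: state -> (dynamic W, static W).
NOMINAL_OP = OperatingPoint(f_hz=2.4e9, vdd=1.2, temp_k=350.0, vt=0.3)
NOMINAL_W = {"active": (1.0, 0.2), "stall": (0.6, 0.2), "idle": (0.3, 0.2), "pg": (0.0, 0.02)}

consts = calibrate(NOMINAL_OP, NOMINAL_W)
slow = OperatingPoint(f_hz=1.2e9, vdd=1.0, temp_k=330.0, vt=0.3)
print("active at reduced f/Vdd:", power_w(consts, slow, "active"))
```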
Slide -52 -DEIS Doctoral School 2010
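Thermal Model
Power to thermal interface
[Diagram: per-block power values (PCPU1 ... PCPUn, each with PL1; L2; Network) are mapped onto a thermal model made of silicon (si) and copper (cu) cells. Layers shown: IC die, heat spreader, IC package, package pin, PCB. Output: temperatures Ti]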
Slide -53 -DEIS Doctoral School 2010
Reliability Model
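Aging and critical path delay:
Facelift: Hiding and Slowing Down Aging in Multicores. A. Tiwari, J. Torrellas
Δvt_stress NBTI:
  $\Delta V_{t,stress} = A_{NBTI} \cdot t_{ox} \cdot \sqrt{C_{ox} (V_{dd} - V_t)} \cdot e^{\frac{V_{dd} - V_t}{t_{ox} \cdot E_0}} \cdot e^{-\frac{E_a}{K T}} \cdot t_{stress}^{0.25}$
Δvt_recovery NBTI:
  $\Delta V_{t,recovery} = \Delta V_{t,stress} \cdot \left(1 - \sqrt{\frac{\eta \cdot t_{recovery}}{t_{stress} + t_{recovery}}}\right)$
Δvt_stress HCI:
  $\Delta V_{t} = A_{HCI} \cdot \alpha \cdot f \cdot e^{\frac{V_{dd} - V_t}{t_{ox} \cdot E_1}} \cdot \sqrt{t}$
Reliability Model interfaces:
  Inputs: from DVFS model, from thermal model, from CPU usage
  Outputs: to power model, to DVFS model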
Slide -54 -DEIS Doctoral School 2010
Simulator Performance
Host: Intel Pentium Core 2 Duo 2.4 GHz, 2GB RAM
Target: 4-core Pentium 4, 2GB RAM, 32 KB private L1 cache, 4 MB shared L2 cache

Configuration                                         Tsim
Simics + Ruby                                         1040 s
Simics + Ruby + DVFS                                  1045 s
Simics + Ruby + DVFS + Power                          1110 s
Simics + Ruby + DVFS + Power + Thermal interface      1160 s
Simics + Ruby + DVFS + Power + Thermal Model          1240 s

68 cells, T = 100ns
Compute every 13us
1 billion instructions ~ 0.5 sec virtual time