OK Labs - Virtualization as the Nexus of Multicore Power Management
![Page 1: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/1.jpg)
November 9-11, 2010, The Santa Clara Convention Center
www.armtechcon.com
![Page 3: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/3.jpg)
• Energy-management basics
• Virtualization
• Enter multicore
• Summary
Overview
![Page 4: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/4.jpg)
Device uses energy
• Drains battery
Goal of energy management:
• Maximize battery life
Energy in Mobile Devices
![Page 5: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/5.jpg)
Dynamic voltage and frequency scaling
CMOS power consumption:
• $P = P_{dyn} + P_{stat}$
• $P_{dyn} \propto f V^2$
• $V_{min} \propto f$ (very approximately)
Assuming execution time $T \propto 1/f$:
• $E_{dyn} = P_{dyn} T \propto f V^2 / f = V^2 \propto f^2$
• lower frequency ⇒ lower dynamic energy
Energy-Management Mechanisms: DVFS
![Page 6: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/6.jpg)
When CPU is idle, turn clock off
• $P_{dyn} = 0$ ⇒ $P = P_{stat}$
Sleep states reduce power further:
• $P_{sleep} < P_{stat}$
Typically have multiple sleep states
• shallow sleep states save some energy but are fast to enter/exit
• deep sleep states save more energy but lose state and are expensive to enter/exit
Complex tradeoff
Mechanisms: Sleep States
![Page 7: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/7.jpg)
$E_{dyn} \propto f^2$ ⇒ lowest frequency is best
Ignores static energy!
• $E = E_{dyn} + E_{stat}$
• $E_{dyn} \propto f^2$
• $E_{stat} = P_{stat} T \propto 1/f$
Low $f$ increases execution time ⇒ $E_{stat}$ increases at low $f$!
Popular Approach: Lowest Frequency
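The two proportionalities on this slide can be combined into a short worked derivation. This is only a sketch: the constants $a$ and $b$ are assumptions standing in for platform-specific dynamic and static coefficients, which the slides do not give.

```latex
\begin{align*}
E(f) &= E_{dyn} + E_{stat} = a f^2 + \frac{b}{f} \\
\frac{dE}{df} &= 2 a f - \frac{b}{f^2} = 0
\quad\Rightarrow\quad
f^{*} = \left(\frac{b}{2a}\right)^{1/3}
\end{align*}
```

Under this simplified model the energy-optimal frequency $f^{*}$ grows with the static share $b$, so the lowest available frequency is optimal only if it already lies at or above $f^{*}$.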
![Page 8: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/8.jpg)
Run at maximum $f$, then go to sleep
• Tries to minimize static power, but:
• dynamic power isn’t irrelevant (yet)
• $T \propto 1/f$ isn’t correct either: it ignores memory!
Effect of memory stalls
• $T = T_{CPU} + T_{mem}$
• $T_{CPU} \propto 1/f$
• $T_{mem} = \text{const}$
• $E_{stat} \propto T$, where $T$ is a $1/f$ term plus a constant
Ignores sleep energy!
Other Approach: “Race to Halt”
![Page 9: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/9.jpg)
Run at maximum $f$, then go to sleep
Earlier completion ⇒ longer sleep
• $E = E_{dyn} + E_{stat} + E_{sleep}$
• $E_{sleep} = P_{sleep} T_{sleep}$
• $T_{sleep} = T_0 - T$
• $E_{sleep} = P_{sleep} (T_0 - T)$
Still ignores dynamic energy!
Other Approach: “Race to Halt” (2)
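The terms from the last three slides can be put together in a small numeric sketch. Everything here is an assumption for illustration: the function names, the constants in `PLATFORM`, the setpoints, and the choice to charge dynamic power only while the CPU is executing (not during memory stalls). None of these come from the talk.

```python
def total_energy(f, cpu_cycles, t_mem, t0, c_dyn, p_stat, p_sleep):
    """Energy over a fixed period t0 when running the work at frequency f.

    Simplified model: E = E_dyn + E_stat + E_sleep, with
    T = T_CPU + T_mem, T_CPU = cpu_cycles / f, and
    P_dyn = c_dyn * f**3 (from P_dyn ∝ f V^2 with V ∝ f).
    """
    t_cpu = cpu_cycles / f
    t = t_cpu + t_mem
    if t > t0:
        raise ValueError("work does not finish within the period")
    e_dyn = c_dyn * f**3 * t_cpu      # assumed: dynamic power only during CPU time
    e_stat = p_stat * t               # static power for the whole active time
    e_sleep = p_sleep * (t0 - t)      # sleep power for the rest of the period
    return e_dyn + e_stat + e_sleep


def best_frequency(setpoints, **platform):
    """Return the setpoint with the lowest modeled total energy."""
    return min(setpoints, key=lambda f: total_energy(f, **platform))


# Hypothetical numbers (GHz, Gcycles, seconds, watts), chosen only for illustration.
PLATFORM = dict(cpu_cycles=1.0, t_mem=0.2, t0=3.0,
                c_dyn=0.5, p_stat=0.3, p_sleep=0.05)
SETPOINTS = [0.4, 0.6, 0.8, 1.0]

if __name__ == "__main__":
    for f in SETPOINTS:
        print(f, round(total_energy(f, **PLATFORM), 4))
    print("best:", best_frequency(SETPOINTS, **PLATFORM))
```

With these made-up numbers the minimum falls at the 0.6 setpoint, which is neither the lowest frequency nor race-to-halt; different constants move the optimum, which is the point of the slides.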
![Page 11: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/11.jpg)
Real Data: Total Energy (Measured)
[Figure: curves labeled CPU-bound, Memory-bound, and Naïve model]
![Page 12: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/12.jpg)
Real Data: Including Sleep Energy
[Figure: curves labeled High-power sleep state and Low-power sleep state]
![Page 13: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/13.jpg)
Energy management is complex!
Optimal setting depends on:
• Workload
  • memory-bound vs CPU-bound vs in-between
• Hardware platform
  • static vs dynamic energy
  • CPU vs memory power
  • depth of sleep states and cost of entering
Simple models don’t work!
Summary: Energy-Management Basics
![Page 14: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/14.jpg)
How to establish memory-boundedness?
Easy way out: pre-characterization
• measure behavior off-line
• determine optimal power setting by model or trial-and-error
OK-ish for pre-defined workloads
Unsuitable for open systems
• ... such as phones
Tricky with apps which change behavior
Characterizing Workloads
![Page 15: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/15.jpg)
Need to observe app and adjust setting
• works for any app
• adjusts to changing behavior
Solution by [Snowdon et al., EuroSys’09]
Performance counters are your friends!
• e.g. cache misses indicate memory access
Can systematically select best counters
• build model of platform
• linear combination of performance-counter readings
• pre-characterize hardware
• pick counters which provide most accurate model
• using sound statistical methods
Better Way: On-Line Characterization
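As a rough illustration of picking counters by how well they explain the measured behavior, the sketch below fits a one-variable least-squares line per candidate counter and keeps the one with the highest R². This is a deliberate simplification and not the method of the cited paper: the slide describes a linear combination of several counters, while this uses one counter at a time. The counter names and sample values are made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y ≈ intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


def best_counter(samples, target):
    """Pick the counter whose readings best predict the target (highest R²)."""
    return max(samples, key=lambda name: fit_line(samples[name], target)[2])


# Hypothetical off-line characterization data: one reading per benchmark run.
SAMPLES = {
    "cache_misses": [1, 2, 3, 4, 5],
    "instructions": [5, 3, 4, 1, 2],
}
# Hypothetical measured quantity to predict, e.g. memory-stall time per run.
TARGET = [1.1, 1.9, 3.2, 3.9, 5.0]

if __name__ == "__main__":
    print(best_counter(SAMPLES, TARGET))
```

Extending this to several counters would mean a multivariate regression plus a selection procedure; the one-counter version is kept here only because it fits in a few lines without external libraries.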
![Page 16: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/16.jpg)
Model predicts energy consumption and relative execution speed
• at present setpoint
• at different setpoints
Accurately predicts energy and performance response to DVFS
• within a few %
Can use this for informed energy-management decisions
On-Line Characterization & Modeling
![Page 19: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/19.jpg)
What is “best”?
• Maximal Performance?
• Minimal Energy?
• Minimal Power?
Depends...
May change
• battery depletes
Need flexible policies
Energy Management Policies
[Diagram boxes: Workload Statistics, Workload Prediction, Energy/Performance Models, Candidate Setpoints, QoS Info, Selection Policy, Setting]
![Page 21: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/21.jpg)
Generalized Energy-Delay Policy
[Figure labels: Performance, Energy, CPU-bound, Memory-bound]
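The slides name the policy but do not give its formula. The sketch below assumes a metric of the form $P^{1-\alpha} T^{1+\alpha}$, which is how the generalized energy-delay idea from the cited Snowdon et al. work is usually summarized; treat both the formula and the numbers as assumptions. With that form, a single knob $\alpha$ covers the three goals listed on the previous slide.

```python
def select_setpoint(candidates, alpha):
    """Pick the candidate minimizing power**(1 - alpha) * time**(1 + alpha).

    candidates: list of (frequency, predicted_power, predicted_time) tuples,
    as a model like the one described earlier might produce.
    alpha = 1 -> minimize time (maximal performance)
    alpha = 0 -> minimize power * time (minimal energy)
    alpha = -1 -> minimize power (minimal power)
    The metric's form is an assumption; the slides do not state it.
    """
    def metric(candidate):
        _, power, time = candidate
        return power ** (1 - alpha) * time ** (1 + alpha)

    return min(candidates, key=metric)


# Hypothetical model predictions: (GHz, watts, seconds).
CANDIDATES = [
    (0.4, 0.3, 2.7),
    (0.6, 0.4, 1.9),
    (0.8, 0.6, 1.45),
    (1.0, 0.8, 1.2),
]

if __name__ == "__main__":
    for alpha in (1.0, 0.0, -1.0):
        print(alpha, select_setpoint(CANDIDATES, alpha))
```

A policy module could change `alpha` at run time, for example moving toward lower values as the battery depletes.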
![Page 23: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/23.jpg)
Implementation of power model and policies
• once for platform vs once for each guest
• no guest has global view, hypervisor does
• integration with other cores: DSPs, baseband processor
• policy-mechanism separation
Why do it outside the OS?
![Page 24: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/24.jpg)
Controls all resources
• CPU, memory, devices
De-privileged guest OSes
• execute in user mode
• prevents interference with hypervisor and with other guests
• ensures hypervisor retains control over resources
The Hypervisor
![Page 25: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/25.jpg)
Subsystems compete for it
Cannot let subsystems manage it
• just as with memory, CPU
Needs trusted, central authority
Needs to be done in virtualization layer
Energy is a Global Resource
![Page 26: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/26.jpg)
Mechanisms in hypervisor
Policies in isolated management module
Keep hypervisor policy-free
• HW-like
Policy-Mechanism Separation
![Page 27: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/27.jpg)
Additional degree of freedom
• DVFS + sleep states + core shutdown
• Hypervisor supports transparent, temporary consolidation of cores
• Unneeded cores turned off to reduce power
Different tradeoffs
• Performance vs power close to linear
Important to manage cores globally
• On average more cores off than with per-guest management
• Can use deeper sleep state
• Less overall energy use
Enter Multicore
[Diagram: OKL4 Microvisor with Subsystem #1 and Subsystem #2, their VCPUs, and the physical CPUs]
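Why global management tends to leave more cores off can be shown with a small packing sketch. It is not the OKL4 algorithm, which the slides do not describe: the first-fit-decreasing heuristic, the function names, and the load figures are all assumptions for illustration.

```python
def first_fit_decreasing(loads, capacity=1.0):
    """Pack VCPU loads onto cores, largest first; returns a list of cores."""
    cores = []
    for load in sorted(loads, reverse=True):
        for core in cores:
            if sum(core) + load <= capacity:
                core.append(load)
                break
        else:
            cores.append([load])  # no existing core has room: power up another
    return cores


def cores_needed_per_guest(guest_loads, capacity=1.0):
    """Each guest consolidates only its own VCPUs on its own cores."""
    return sum(len(first_fit_decreasing(loads, capacity))
               for loads in guest_loads.values())


def cores_needed_global(guest_loads, capacity=1.0):
    """A hypervisor with a global view consolidates VCPUs across guests."""
    all_loads = [load for loads in guest_loads.values() for load in loads]
    return len(first_fit_decreasing(all_loads, capacity))


# Hypothetical utilization of each VCPU, as a fraction of one core.
GUEST_LOADS = {
    "subsystem_1": [0.3, 0.2],
    "subsystem_2": [0.25, 0.1],
}

if __name__ == "__main__":
    print("per-guest:", cores_needed_per_guest(GUEST_LOADS))
    print("global:   ", cores_needed_global(GUEST_LOADS))
```

In this example each guest alone still needs one core, so two stay powered, while the global view fits all four VCPUs on a single core.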
![Page 28: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/28.jpg)
Cache coherency couples clock frequencies of multiple cores
OSes running on different cores cannot adjust clock independently
Requires entity with global view
Enter Multicore: Architectural Constraints
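One way to picture the constraint: if coupled cores must share a single clock setting (a simplifying assumption; the slide only says the frequencies are coupled), then someone with a global view has to pick a value that satisfies the most demanding core. The function and guest names below are hypothetical.

```python
def shared_frequency(requests, setpoints):
    """Lowest available setpoint that satisfies the most demanding request.

    requests: mapping of guest name -> frequency it would like (GHz).
    setpoints: frequencies the shared clock domain supports (GHz).
    Falls back to the highest setpoint if no setpoint is high enough.
    """
    needed = max(requests.values())
    for f in sorted(setpoints):
        if f >= needed:
            return f
    return max(setpoints)


if __name__ == "__main__":
    print(shared_frequency({"guest_a": 0.3, "guest_b": 0.7},
                           [0.4, 0.6, 0.8, 1.0]))
```

No individual guest can make this choice correctly, because it does not see the other guests' requests.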
![Page 29: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/29.jpg)
Cores have same ISA but different clock rates
Hypervisor can determine optimal mapping of subsystems to cores
• Using same infrastructure as for DVFS
• Integrate with temporary core consolidation
Asymmetric Multicore
[Diagram: OKL4 Microvisor with a CPU-bound subsystem and a memory-bound subsystem, their VCPUs, a fast CPU, and a slow CPU]
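A minimal sketch of such a mapping, assuming the hypervisor already has a per-subsystem estimate of memory-boundedness (for instance from the counter-based model mentioned earlier). The `stall_fraction` input, the subsystem names, and the core names are hypothetical, and real placement would also weigh load and consolidation.

```python
def map_to_cores(stall_fraction, fast_cores, slow_cores):
    """Assign the most CPU-bound subsystems to fast cores, the rest to slow ones.

    stall_fraction: mapping of subsystem -> fraction of time stalled on memory
    (lower means more CPU-bound, so it benefits more from a fast core).
    """
    ranked = sorted(stall_fraction, key=stall_fraction.get)
    cores = list(fast_cores) + list(slow_cores)
    if len(ranked) > len(cores):
        raise ValueError("more subsystems than cores; consolidation needed")
    return dict(zip(ranked, cores))


if __name__ == "__main__":
    print(map_to_cores({"video": 0.7, "crypto": 0.1}, ["fast0"], ["slow0"]))
```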
![Page 30: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/30.jpg)
Move subsystems between cores
• including temporary consolidation of different subsystems on common core
Architectural inter-core dependencies
• cannot manage core clocks independently
Requires global control
• ... outside individual OSes
• indirection layer between OS and hardware
No practical alternative to virtualization!
The Future is Multicore
[Diagram: OKL4 Microvisor with Subsystem #1 and Subsystem #2, their VCPUs, and the physical CPUs]
![Page 31: OK Labs - Virtualization as the Nexus of Multicore Power Management](https://reader033.fdocuments.us/reader033/viewer/2022061209/548cf6b2b47959145c8b4665/html5/thumbnails/31.jpg)
Virtualization is unavoidable long-term
... but provides other benefits short-term
Early uptake maximises benefits
Future-proof your designs!
Summary