Introduction to IBM tools to manage energy consumption
© 2011 IBM Corporation
The energy aspects of computing (Les aspects énergétiques du calcul)
Introduction to IBM tools to manage energy consumption
François Thomas, Luigi Brochard
[ft,luigi.brochard]@fr.ibm.com
Journée Thématique Emergente – EDF Clamart, 13 January 2011
Agenda
Why IBM ?
The Power Cycle and Equation
Power7 EnergyScale
Software to Manage Power
Summary
Agenda
Why IBM ?
The Power Cycle and Equation
Power7 EnergyScale
Software to Manage Power
Summary
Why IBM ?
Early in the game
Consistently at the top of the Green500 list
Now a strong selling point
Why IBM ?
Early in the game
– Blue Gene/L design started in 1999-2000
– First appearance in the Top500 list in 2004
– Before the first Green500 list was out (2005)
Consistently at the top of the Green500 list
– Blue Gene/P: 2007
– Cell Broadband Engine (RoadRunner) + BG/P: 2008-2009
– Blue Gene/Q: 2010 onwards
– SuperMUC: 2011 ?
Our expertise in energy efficiency is now a strong selling point
– Purpose built (Blue Gene)
– Accelerated computing (RoadRunner, GPU)
– COTS hardware (Intel based)
Sources : green500.org and hpcwire.com
© 2010 IBM Germany GmbH
LRZ + IBM Germany
Smarter Systems for a Smarter Planet.
Smart System Cooling: Innovative Hot Water Usage
First High End HPC System with Hot Water Cooling
Compute Nodes are cooled with hot water
Inlet temperature up to 45°C
Enables All-Year free cooling in Garching
No cooling aggregates (compressors) required
Enables Re-Use of waste heat of system
Heating or Process Energy
Developed in Germany, @ IBM Böblingen Lab
Aquasar Prototype
Smart Job Scheduling: Energy Aware Application Scheduling and System Management
First Implementation of Energy Aware HPC Software Stack on x86
Application Energy consumption will be monitored, stored and reported to the user
For a second run of the application, the scheduler will decide, based on administrative policies, which processor frequency is optimal for the application
Lower frequency reduces energy consumption
System nodes not currently in use will be put into sleep mode or shut down, based on the administrator's capacity expectations
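The frequency decision described on this slide can be sketched as follows. This is only an illustration under stated assumptions, not the actual scheduler logic: the `RunRecord` type, the `pick_frequency` function, the `max_slowdown` parameter (standing in for an administrative policy) and all measurements are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """One monitored run of an application at a given processor frequency."""
    freq_ghz: float   # frequency used for the run
    runtime_s: float  # measured wall-clock time
    energy_j: float   # measured energy consumed by the run


def pick_frequency(history, max_slowdown):
    """Return the lowest-energy record whose runtime stays within
    max_slowdown (0.10 means 10%) of the fastest recorded run."""
    if not history:
        raise ValueError("no recorded runs for this application")
    fastest = min(r.runtime_s for r in history)
    allowed = [r for r in history if r.runtime_s <= fastest * (1.0 + max_slowdown)]
    return min(allowed, key=lambda r: r.energy_j)


# Hypothetical measurements from earlier runs, invented for illustration.
history = [
    RunRecord(freq_ghz=2.7, runtime_s=100.0, energy_j=30000.0),
    RunRecord(freq_ghz=2.3, runtime_s=108.0, energy_j=27000.0),
    RunRecord(freq_ghz=1.9, runtime_s=125.0, energy_j=26000.0),
]

best = pick_frequency(history, max_slowdown=0.10)
print(best.freq_ghz)
```

With a 10% slowdown allowance the 1.9 GHz run is excluded because it is 25% slower, so the sketch settles on the cheaper of the two remaining runs. A looser policy would let it pick the lowest frequency.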
Agenda
Why IBM ?
The Power Cycle and Equation
Power7 EnergyScale
Software to Manage Power
Summary
Figure: The power cycle: power, compute and cool
Diagram labels: data center; utility provider (2 sources); generators (N+1); fuel oil (48 hrs typical); static switches A and B; uninterruptible power supply (UPS) with batteries (10-15 min); PDU A and PDU B; server; raised floor; CRAC units; chillers (N+1); cooling towers; makeup water; storage. Water temperatures: 45°F, 55°F, 85°F, 95°F. Air temperatures: 55°F, 75°F.
Green Datacenter Market Drivers and Trends
Increased green consciousness, and rising cost of power
IT demand outpaces technology improvements
– Server energy use doubled 2000-2005; expected to increase 15%/year
– 15% power growth per year is not sustainable
– Koomey study: servers use 1.2% of U.S. energy
ICT industries consume 2% of worldwide energy
– Carbon dioxide emissions comparable to global aviation
Source IDC 2006, Document# 201722, "The impact of Power and Cooling on Datacenter Infrastructure, John Humphreys, Jed Scaramella"
Brouillard, APC, 2006
Future datacenters dominated by energy cost; half energy spent on cooling
Real Actions Needed
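To see why 15% yearly growth is hard to sustain, the compounding can be worked out directly. The short sketch below uses only the 15% figure from the slide. The 10-year horizon is an arbitrary choice for illustration.

```python
import math

growth = 0.15  # 15% power growth per year, as quoted on the slide

# Time for consumption to double under constant compound growth.
doubling_years = math.log(2) / math.log(1 + growth)

# Growth factor over an (arbitrarily chosen) 10-year horizon.
factor_10y = (1 + growth) ** 10

print(f"doubling time: {doubling_years:.1f} years")
print(f"growth over 10 years: x{factor_10y:.2f}")
```

A doubling time of about five years is consistent with the slide's statement that server energy use doubled between 2000 and 2005.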
How much does it cost?
Figure: Ratio of costs (acquisition costs vs energy costs over 4 years)
Figure: Ratio of power (IT power vs cooling power)
Our approach is at multiple levels
Micro-electronics level
– Energy is pervasive in IBM design (especially in our journey to Exascale)
– Long history of energy efficient designs : SOI, SMT, eDRAM, ...
Server and rack level
– Energy management features on all recent IBM servers
– Water cooling:
• Rear door heat exchangers (iDataPlex)
• "Cold plate" (Power6, Power7, BG/Q)
• Hot water cooling (LRZ)
Software level
– Application tuning
– Unified software for power management
– Cluster management
– Power and energy aware job schedulers
Data center level
– Centres of expertise in datacenter design
– Example : the Green Data Center in IBM Montpellier, France
– Another example : hot water cooling at IBM Boeblingen, Germany
– Best practices, monitoring
$\text{Power} = \text{Capacitance} \times \text{Voltage}^2 \times \text{Frequency}$
=> $\text{Power} \sim \text{Frequency}^3$ (since voltage scales roughly with frequency)
=> two cores at 80% frequency consume about as much as one core at 100% frequency ($2 \times 0.8^3 \approx 1.02$).
We have a frequency problem:
– Power per chip is constant due to cooling => multicores at constant frequency
And we have a passive power problem:
– Smaller lithography => more leakage current => more idle power
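The cubic rule can be checked numerically. The sketch below assumes the idealized model from this slide (dynamic power proportional to the cube of frequency, no leakage) and, for the throughput comparison, perfect parallel scaling across cores. Both are simplifying assumptions, and the function names are illustrative.

```python
def relative_power(cores, freq_fraction):
    """Dynamic power relative to one core at nominal frequency, assuming P ~ f^3."""
    return cores * freq_fraction ** 3


def relative_throughput(cores, freq_fraction):
    """Throughput relative to one core at nominal frequency,
    assuming perfect parallel scaling (an idealization)."""
    return cores * freq_fraction


one_fast_power = relative_power(1, 1.0)
two_slow_power = relative_power(2, 0.8)
two_slow_throughput = relative_throughput(2, 0.8)

print(f"1 core  @ 100%: power {one_fast_power:.3f}")
print(f"2 cores @  80%: power {two_slow_power:.3f}, throughput {two_slow_throughput:.1f}")
```

Under these assumptions, two slower cores draw about the same power as one fast core while offering up to 1.6 times the throughput, which is the motivation for multicore designs at constant frequency.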
Figure: Number of transistors per chip, 1980-2010 (log scale from 10^5 to 10^10; 1 million to 1 billion and beyond, ~50% CAGR)
Figure: Module heat flux (W/cm²), 1950-2030 (scale 0 to 14). Labels: Bipolar, CMOS, Integrated Circuit, Junction Transistor, 3DI, Low-Power Multicore
The Power Problem
Passive Power continues to explode
Oxide thickness is near the limit.
– Traditional CMOS scaling has ended.
– Density improvements will continue, but power efficiency from technology will only improve very slowly.
– Historic trend of power efficiency improvement will slow
Agenda
Why IBM ?
The Power Cycle and Equation
Power7 EnergyScale
Software to Manage Power
Summary
POWER7 Processor
IBM's 45nm SOI process
567 mm², 1.2B transistors
8 out-of-order cores, 4-way SMT
32KB L1 D/I, 256KB L2 per core, 32MB shared L3 in IBM’s eDRAM process
2 on-chip memory controllers, 2 pairs of buffered memory channels each
Designed for blades, commercial SMPs, supercomputers
4X cores in similar power envelope
Designed for energy-efficiency and effective power management.
Thermal, Power and Activity Sensors
44 digital thermal sensors (5 per chiplet, 4 extra-chiplet) on chip; max chiplet thermal sensor(s) also directly available to firmware.
On-board ambient temperature sensor, memory buffer/DIMM thermal sensors and VRM thermal-trip logic.
On-board measurement circuits and A/D channels for power measurements:
– full system
– processor socket
– memory sub-system, I/O sub-system and fan
Performance/activity sensors
– Core-level usage with active cycle counts, instruction throughput counts
– Core-level memory hierarchy usage – event-based programmable weight counters for frequency impact at high loads
– Memory controller-level activity – requests and power-mode usage stats
Rack to Rack: Power 755 Compared to Power 575 (POWER6)
                          Power 755    Power 575
Cores/chip                8            4
Total cores               32           32
Frequency                 3.3 GHz      4.7 GHz
Memory (max)              256 GB       256 GB
Cooling                   Air          Water
Cores/rack                320          448
Rack type                 19"          24"
Power (Watts) (Linpack)   1650         5400

Each Power 755 node offers the same core count as Power 575 with:
40-50% Improvement in Performance
Air Cooling vs. Water Cooling
1/3 of the Energy Consumption
37% Improvement in floor space for a 64 node configuration
Green500 ~ 495 MFlops/Watt
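The "1/3" claim can be checked against the table. The sketch below uses only figures quoted on this slide (node power under Linpack and the 40-50% performance improvement). Combining them into a performance-per-watt gain is a derived estimate, not a number from the presentation.

```python
p755_watts = 1650.0  # Power 755 node under Linpack, from the table
p575_watts = 5400.0  # Power 575 node under Linpack, from the table

# Ratio of node power draw: close to the "1/3" quoted on the slide.
power_ratio = p755_watts / p575_watts

# Derived estimate: performance-per-watt gain if a 755 node is
# 40% to 50% faster than a 575 node, as the slide states.
perf_per_watt_gain_low = 1.40 / power_ratio
perf_per_watt_gain_high = 1.50 / power_ratio

print(f"power ratio: {power_ratio:.3f}")
print(f"perf/W gain: {perf_per_watt_gain_low:.1f}x to {perf_per_watt_gain_high:.1f}x")
```

Note that the ratio computed here is a power ratio. Because the Power 755 also finishes sooner, energy to solution would drop somewhat further under the same assumptions.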
IBM EnergyScale functions
Power / Thermal Trending
– Collect and report power consumption, inlet and exhaust temp
Power Capping
– Guaranteed (Hard Cap)
• Enforces a power cap via Dynamic Frequency and Voltage Slewing
– Soft Power Cap
• Attempted lower cap, but not guaranteed.
Energy Management Modes – Enhanced for P7
– Static Power Save (SPS)
• Save power via a fixed voltage and frequency drop – as much as 30% down for P7
– Dynamic Power Save (DPS)
• Optimize power vs performance using Dynamic Voltage and Frequency Slewing
• Will provide performance boost at very high utilization
• Will save power at most utilizations
– Dynamic Power Save - Favor Performance (DPS-FP)
• Will provide performance boost at most utilizations
• Will save power only at very low utilization
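The idea of enforcing a cap by lowering frequency can be illustrated with a toy model. This is not the EnergyScale algorithm: it assumes power scales with the cube of frequency, and the nominal values, step size and minimum frequency are hypothetical parameters chosen for the example.

```python
def power_at(freq_ghz, nominal_freq_ghz, nominal_watts):
    """Toy model: power scales with the cube of frequency (P ~ f^3)."""
    return nominal_watts * (freq_ghz / nominal_freq_ghz) ** 3


def enforce_cap(cap_watts, nominal_freq_ghz=3.3, nominal_watts=1650.0,
                step_ghz=0.1, min_freq_ghz=2.0):
    """Step the frequency down until modelled power fits under the cap.

    Returns (frequency, power, cap_met). If the cap cannot be met even at
    the minimum frequency, cap_met is False: such a cap can only be
    attempted, not guaranteed.
    """
    max_steps = int(round((nominal_freq_ghz - min_freq_ghz) / step_ghz))
    freq = nominal_freq_ghz
    power = nominal_watts
    for step in range(max_steps + 1):
        freq = nominal_freq_ghz - step * step_ghz
        power = power_at(freq, nominal_freq_ghz, nominal_watts)
        if power <= cap_watts:
            return freq, power, True
    return freq, power, False


for cap in (2000.0, 1200.0, 300.0):
    freq, power, met = enforce_cap(cap)
    print(f"cap {cap:6.0f} W -> {freq:.1f} GHz, {power:7.1f} W, met={met}")
```

In this sketch a cap above nominal power needs no action, a moderate cap is met by slowing down, and a cap below what the minimum frequency can deliver is reported as not met. That last case mirrors the distinction the slide draws between a guaranteed cap and a soft cap.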
Figure: High Level System Power Control View
Diagram labels: PHYP; TPMD; P7 chip; memory, fans, I/O; Mode 1 & 2A; Mode 2B, 3, 4, 5; P-state; idle state; architected idle instructions (Doze, Nap, …); sensor information (temp, current, performance); policy and feedback; communication interface.
Figure: Cooperative Power Management in EnergyScale
Components shown: system monitoring and management tools (Active Energy Manager); operating systems; hypervisor; FSP; TPMD; POWER7; off-chip/on-board sensors & controls.
Annotations:
– Mechanisms access, low-level coordination among controllers, in-band/out-of-band comm. channel, autonomous/configurable control engines, sensors.
– Real-time power/thermal control, policy-guided, performance-aware energy saving algorithms
– Dynamic resource folding and any explicit low-power mode control
Agenda
Why IBM ?
The Power Cycle and Equation
Power7 EnergyScale
Software to Manage Power
Summary
Some examples
IBM Active Energy Manager (AEM)
– Monitors the power consumption at the node/rack level
– Manages the power consumption (capping, trending, provisioning)
IBM Research tools
– Much higher sampling rates than AEM
– Can separate CPU power, RAM power, other power
– Down to every VRM on a motherboard
Cluster management tool
– Extension to xCAT (eXtreme Cluster Cloud Administration Toolkit)
– To query and set power states
Job Scheduler
– Extension to LoadLeveler
– Power and Energy aware job scheduling function
IBM Systems Director Active Energy Manager (AEM)
AEM is a cornerstone of the IBM energy management framework
Measure, monitor, and control energy usage
Power and Thermal Measurement
Supports System x, POWER, and System z natively
Supports other equipment via external sensors
Integrates with Infrastructure Management
Integrates with Enterprise Management
Monitoring energy in a data center lets you begin to manage it
IBM Systems Director Active Energy Manager V4.2
AEM application supported on:
– Windows, AIX, and Linux (x86, POWER, and System z)
Web-based user interface requiring only a browser
Energy thresholding
– Enables a user to set an energy or temperature threshold and be notified when it is reached (or allow an action to be taken automatically)
Soft power capping (an option within power capping)
– Ability to set a lower energy cap value to enable clients to save energy
Easily set power caps on multiple systems
Group capping (an option within power capping)
– Enables a user to set an energy cap for a group of servers (such as all the servers in a rack)
Data to aid in server power on/off scenarios
– Understand time to IPL and standby power
– Number of lifetime IPLs and reliability threshold (P7 only)
xCAT
Manage power consumption on an ad hoc basis
– For example, while the cluster is being installed, or when there is high power consumption in other parts of the lab for a period of time
– Query: power saving mode, power capping value, power consumed info, CPU usage, fan speed, environment temperature
– Set: power saving mode and power capping value
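As an illustration of the query and set operations, the sketch below builds command lines for the xCAT `renergy` command. The slides do not name the command or its attributes, so everything here is an assumption recalled from xCAT documentation: the attribute names (`savingstatus`, `cappingstatus`, `cappingvalue`, `cappingwatt`) may differ by hardware and xCAT version, and the node range `rack1` is hypothetical. The code only prints the commands and does not run them.

```python
import shlex

# Attribute names are assumptions; check `man renergy` on the target system.
QUERY_ATTRS = ["savingstatus", "cappingstatus", "cappingvalue"]


def renergy_query(noderange, attrs=QUERY_ATTRS):
    """Build an argv list that queries energy attributes for a node range."""
    return ["renergy", noderange] + list(attrs)


def renergy_set(noderange, attr, value):
    """Build an argv list that sets one energy attribute for a node range."""
    return ["renergy", noderange, f"{attr}={value}"]


# 'rack1' is a hypothetical node range used only for this example.
print(shlex.join(renergy_query("rack1")))
print(shlex.join(renergy_set("rack1", "savingstatus", "on")))
print(shlex.join(renergy_set("rack1", "cappingwatt", 1200)))
```

Building the argument list separately from executing it keeps the sketch safe to run anywhere. On a real management node the list could be passed to `subprocess.run` once the attribute names have been verified.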
Power and Energy Aware LoadLeveler Goals
– Identify idle nodes in the cluster and put them in the lowest power mode
– Provide system admins with query capability on historical usage of power and energy by workload, user, etc.
– Reduce the energy consumption of workloads with minimal impact on performance
Choices for the system admin:
– Decide whether or not to use the Energy Optimize policy on the system
– Decide the maximum performance degradation an application may incur if the Energy policy is applied
– If the Energy policy is on, it is applied only to jobs that match the performance degradation criteria
– The system admin can query the LL DB to evaluate the impact of the potential policy on performance degradation and energy saving
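The admin choices listed above can be sketched as a simple filter. This is a hypothetical illustration, not LoadLeveler's implementation or API: the `JobEstimate` type, the function names and all figures are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class JobEstimate:
    """Predicted effect of the energy policy on one job (hypothetical data)."""
    name: str
    perf_degradation: float  # predicted slowdown, 0.05 means 5%
    energy_saving: float     # predicted energy saving, 0.12 means 12%


def jobs_under_policy(jobs, policy_on, max_degradation):
    """Names of jobs the energy policy would apply to.

    The policy applies only when it is switched on, and only to jobs whose
    predicted degradation stays within the admin's limit."""
    if not policy_on:
        return []
    return [j.name for j in jobs if j.perf_degradation <= max_degradation]


def idle_nodes(jobs_by_node):
    """Nodes with no running jobs: candidates for the lowest power mode."""
    return sorted(node for node, running in jobs_by_node.items() if not running)


# Invented estimates and cluster state, for illustration only.
estimates = [
    JobEstimate("cfd_run", perf_degradation=0.03, energy_saving=0.10),
    JobEstimate("md_run", perf_degradation=0.12, energy_saving=0.18),
]
cluster = {"n01": ["cfd_run"], "n02": [], "n03": ["md_run"], "n04": []}

print(jobs_under_policy(estimates, policy_on=True, max_degradation=0.05))
print(idle_nodes(cluster))
```

Evaluating the same filter with different `max_degradation` values is one way an admin could explore the trade-off between performance degradation and energy saving before enabling the policy.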
Summary
Having been hurt early by the energy consumption of its servers, IBM started working on the problem early.
Energy management is pervasive in IBM server design, from chips to servers to clusters to datacenters, and even more so with the trend to Exascale.
Good energy management can be a key differentiator in some HPC deals.
We try to tackle the problem at various levels: chip design, system design, cluster management software, job schedulers.
We have monitoring tools that work across the whole IBM portfolio of servers, whatever the microprocessor architecture (IBM or Intel) or the form factor (rackable servers, blades, integrated racks).
Using those tools, our customers can save substantially on their energy bills.
Thank you. Questions ?