Green High-Performance Computing Centers – 2011
Supercomputing & Data Center “New” Best Practices
M+W U.S., Inc. – A Company of the M+W Group
July 2011
© M+W Group – M+W U.S., Inc. – Confidential
AGENDA
ASHRAE TC 9.9 Guidelines – Whitepaper 2011
Server Fan Speeds and Failure Rates
Chiller-less Data Center Solutions
Supercomputer Leadership in Data Centers & HPC Power Efficiencies
HPC Technologies & Facilities Solutions
GPU/ CPU Computing
New HPC Cooling Systems
Bringing Fluid Directly to the Chip
Using Waste Heat for Practical Purposes
New “Total” Energy Efficiency Concepts
Modular and Container Designs
Renewable Energy Sources
Supercomputer Facilities Examples
PUEs of 1.05 and Beyond the PUE
Green High Performance Computing Centers
ASHRAE TC 9.9
2011 Thermal Guidelines for Data Processing Environments
Expanded Data Center Classes and Usage Guidance
Whitepaper prepared by ASHRAE Technical Committee (TC) 9.9
Mission Critical Facilities, Technology Spaces, and Electronic Equipment
http://tc99ashraetcs.org
ASHRAE 2011 New Thermal Guidelines & Expanded Data Center Classes
© 2011, American Society of Heating, Refrigerating and Air-Conditioning Engineers, Inc. All rights reserved. This publication may not be reproduced in whole or in part; may not be distributed in paper or digital form; and may not be posted in any form on the Internet without ASHRAE’s expressed written permission. Inquiries for use should be directed to publisher@ashrae.org.
Source: ASHRAE Whitepaper – 2011 Thermal Guidelines for Data Processing Environments
Thanks to ASHRAE TC 9.9 for a dramatic change in data center HVAC design
“ASHRAE response to industry calling for more energy efficient data center operations”
The 2011 whitepaper provides a holistic approach:
Considers all the variables
Encourages data center owners to weigh risks and rewards
Encourages “next generation” HVAC energy efficiencies
Air and water economizers as the preferred method of cooling
Chiller-less data centers
Mission Critical Magazine, May/June 2011
ASHRAE 2011 New Thermal Guidelines – Response to Industry Calling for Energy Efficient Data Centers
Source: ASHRAE Whitepaper – 2011 Thermal Guidelines for Data Processing Environments
ASHRAE 2011 New Thermal Guidelines – Considering All the Consequences of Pushing the Limits
Source: ASHRAE Whitepaper – 2011 Thermal Guidelines for Data Processing Environments
ASHRAE 2011 New Thermal Guidelines – Change in Temperature & Humidity Operating Envelope over Time
Source: ASHRAE Whitepaper – 2011 Thermal Guidelines for Data Processing Environments
ASHRAE 2011 New Thermal Guidelines – Expanded Data Center Classes: Enterprise & Volume Servers
Source: ASHRAE Whitepaper – 2011 Thermal Guidelines for Data Processing Environments
ASHRAE 2011 New Thermal Guidelines & Expanded Data Center Classes
ASHRAE 2011 Thermal Guidelines – “Allowable” and Recommended Temperatures as a Standard
Source: ASHRAE Whitepaper – 2011 Thermal Guidelines for Data Processing Environments
ASHRAE clearly states that “Allowable” temperatures are intended for design
It insists that “Recommended” temperatures can be exceeded frequently:
Enterprise spaces with 90°F supply air
Volume servers with up to 113°F supply air
Continuous operation at 113°F in cloud environments, with some consequences:
increased server failure rates
server fan ramping
ASHRAE 2011 New Thermal Guidelines – “Allowable” Temperatures for Design
ASHRAE 2011 Thermal Guidelines – Server Failure Rates at Allowable Temperatures
Source: ASHRAE Whitepaper – 2011 Thermal Guidelines for Data Processing Environments
ASHRAE 2011 Thermal Guidelines – Server Failure Rates at Actual Ambient Temperatures
Source: ASHRAE Whitepaper – 2011 Thermal Guidelines for Data Processing Environments
How hot can we go? It is purely a function of server resilience; as technology improves, so will operating temperatures.
Server replacement costs are much lower than the cost of the original cooling equipment plus its ongoing maintenance and electricity bills.
Could you afford to replace less than twice the number of servers you do now, in return for saving the cost of cooling systems and electricity bills?
Failure rates at “actual” outside air temperatures are often lower than the rates at the “recommended” temperatures used for design.
The cost savings on cooling equipment and electricity far outweigh the replacement costs of volume servers.
How often should we use a chiller? … maybe never at all!
ASHRAE 2011 New Thermal Guidelines – “Allowable” Temperatures Used for Design
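The replace-servers-instead-of-chillers tradeoff above can be put in numbers. A back-of-the-envelope sketch, in which every figure (fleet size, failure rates, server and energy costs) is an illustrative assumption, not ASHRAE or M+W data:

```python
# Hypothetical break-even: extra server replacements vs. chiller-less savings.
# All figures below are assumptions for illustration, not ASHRAE or M+W data.

servers = 1000                    # fleet size (assumed)
base_failure_rate = 0.04          # annual failure rate at recommended temps (assumed)
failure_multiplier = 1.5          # hotter operation raises failures, but < 2x (assumed)
server_cost = 5_000               # replacement cost per volume server, $ (assumed)

extra_failures = servers * base_failure_rate * (failure_multiplier - 1)
extra_replacement_cost = extra_failures * server_cost

# Assumed annual savings from running without chillers on outside air:
energy_savings = 400_000          # electricity no longer spent on mechanical cooling
capex_and_maintenance = 200_000   # amortized cooling equipment and upkeep avoided

net_savings = energy_savings + capex_and_maintenance - extra_replacement_cost
print(f"Extra replacements/yr: {extra_failures:.0f}")
print(f"Net annual savings: ${net_savings:,.0f}")
```

With these assumptions, 20 extra replacements a year cost $100,000 against $600,000 of avoided cooling spend; the argument holds as long as the failure multiplier stays modest.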
Source: ASHRAE Whitepaper – 2011 Thermal Guidelines for Data Processing Environments
ASHRAE 2011 New Thermal Guidelines – Chiller Hours Required at US Locations
Source: ASHRAE Whitepaper – 2011 Thermal Guidelines for Data Processing Environments
ASHRAE 2011 New Thermal Guidelines – Chiller Hours Required at International Locations
Source: ASHRAE Whitepaper – 2011 Thermal Guidelines for Data Processing Environments
ASHRAE 2011 New Thermal Guidelines – Server Power Consumption at New Allowable Temperatures
ASHRAE 2011 Thermal Guidelines – Server Fan Ramping Airflows at New Allowable Temperatures
Source: ASHRAE Whitepaper – 2011 Thermal Guidelines for Data Processing Environments
HVAC design guidelines:
Open rack and server environments – design to remove the total heat load
Enclosed aisles and ducted air are more sensitive to changes in pressure and flow – design to meet the total server fan demand
This may create critical design limitations at higher temperatures:
Server fan flow rates can triple between 80°F and 113°F
Fans consume far more energy when ramping – by the fan affinity laws, fan power scales with the cube of speed
PUE appears to improve with server fan ramping (PUE = total facility energy / IT energy, and server fans are metered as IT energy)
A false indication of energy efficiency improvement
Supply air passageways – air handlers, CRACs, ductwork, floor perforations, plenums, and racks – need to allow for higher flows at reasonable velocities; a lack of air will starve the servers
ASHRAE 2011 New Thermal Guidelines – “Allowable” Temperatures: Opportunities & Limitations
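The cube-law fan behavior and its distorting effect on PUE can be sketched directly. The 10 kW rack power split below is an assumed example, not measured data:

```python
# Sketch of the fan affinity laws and their effect on measured PUE.
# The rack power split below is an illustrative assumption.

def fan_power(base_kw, flow_ratio):
    """Fan affinity laws: power scales with the cube of the flow (speed) ratio."""
    return base_kw * flow_ratio ** 3

it_non_fan_kw = 9.0    # servers minus their fans (assumed)
server_fan_kw = 1.0    # server fans at baseline flow (assumed)
facility_kw = 3.0      # cooling, distribution losses, etc. (assumed)

for flow_ratio in (1.0, 2.0, 3.0):       # flow can triple between 80 F and 113 F
    fans = fan_power(server_fan_kw, flow_ratio)
    it_kw = it_non_fan_kw + fans         # server fans are metered as IT load
    total = it_kw + facility_kw
    pue = total / it_kw                  # PUE = total facility energy / IT energy
    print(f"flow x{flow_ratio:.0f}: fans {fans:5.1f} kW, total {total:5.1f} kW, PUE {pue:.2f}")
```

PUE drops from 1.30 to about 1.08 as the fans ramp, even though total power triples – exactly the false efficiency signal the slide warns about.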
Source: ASHRAE Whitepaper – 2011 Thermal Guidelines for Data Processing Environments
ASHRAE 2011 New Thermal Guidelines – Sound Levels at New Allowable Temperatures
Source: ASHRAE Whitepaper – 2011 Thermal Guidelines for Data Processing Environments
ASHRAE 2011 New Thermal Guidelines – Design Process Flow Diagram
Conventional open-space air cooling is known for reliability … not energy efficiency
Implement air flow controls: hot-aisle/cold-aisle and rack containment
Elevate supply air temperatures per the new high-temperature ASHRAE guidelines
Free cool with air and water economizers
Increase equipment set points
Operate with very few hours of chiller and HVAC, or without chillers altogether
Direct and indirect evaporative cooling increase free-cooling hours
Air-to-air heat exchangers preserve air quality
Use inexpensive dry coolers for the hottest days
Air Cooled Systems – Enclose / Elevate Supply Temp / Free Cool / Economize ~ or Don’t Bother to Contain at All
Cooling Concepts and Container Trends
Airside economizing
kW per rack
Delta T across the equipment
Controls
Hybrid (air- and water-side) evaporative cooling
Smaller DX units for crunch times
Waterside economization
Cooling towers
Power
277/400 V to the rack
Container changes
Using DC power in the container for distribution
Losing the case and the fan
Other Proven Data Center Trends and Concepts
Powering and Cooling the “Supercomputer” Brain – No More CRAC Units / No More Waste
Stacking processing chips can make for more compact computers. But it will take advances in “microfluidics” to make such dense number crunching practical. – By Avram Bar-Cohen and Karl J. L. Geisler
Supercomputers for Internet and Enterprise Computing – high-speed, high-heat, and high-efficiency processing:
Arrays of processors in a single computer
Efficient storage, network, and applications technologies
Fault-tolerant network architecture: parallel processing and network failover
A model for cloud computing
Fewer dependencies on facilities systems
Better IT efficiencies and cloud performance: automated virtualization with ever-higher capacities
High processing capacity per unit of energy consumed
Low and constant PUE ~ no longer the focus of data center performance
Less need for local backup power
High-temperature, chiller-less operations
Energy-efficient modular facilities and controlled environments
High-quality waste heat for commercial and industrial uses – more industrial and less commercial in character
The “New” Data Center of the Future – HPC Leading the Way
Power consumption and heat dissipation by today’s high-end computers have become important environmental and cost issues; the goal is to get the most value out of the computational energy consumed:
System engineering methods to ensure that processors not in active use automatically idle at low power
Hardware upgrades and tracking tools to increase system utilization and reliability while reducing wasted compute cycles
Application optimization tools and methods for increased efficiency, producing more computational results with the same or fewer resources
A whole-systems perspective that plans, designs, and installs cooling and ventilation systems that also save energy, with sustainability through advanced technology development and systems engineering practices
The HPC Energy Challenge and “IT” Best Practices
Argonne’s Leadership Computing Facility:
The Blue Gene uses “three times less power than other installed systems of similar computing size.”
Its cores run about three times slower than typical high-end cores.
The relationship between speed and power consumed is not linear, but exponential.
By scaling back the frequency and voltage of each of the cores, we get a supercomputer that is both green and fast.
The IBM Blue Gene/P and Q – Minimize HPC Heat Loads by Reducing Compute Speeds
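The speed-power relationship the slide describes follows from the standard first-order CMOS dynamic power model, P ≈ C·V²·f. The sketch below uses that textbook model only; none of the numbers are Argonne or IBM measurements:

```python
# First-order CMOS dynamic power: P ~ C * V^2 * f.
# If supply voltage scales roughly with frequency, power grows ~ f^3,
# so modest speed reductions buy large power savings.
# Textbook model only; no Argonne or IBM measurements are used here.

def relative_power(freq_ratio, voltage_tracks_frequency=True):
    v = freq_ratio if voltage_tracks_frequency else 1.0
    return v ** 2 * freq_ratio

# Cores running at ~1/3 the speed of typical high-end cores:
slow = relative_power(1 / 3)     # ~1/27 of full-speed power per core
print(f"Per-core power at 1/3 speed: {slow:.3f}x")

# Spend the saved power budget on many more slow cores:
cores = 9
print(f"{cores} slow cores: {cores * slow:.2f}x power, {cores / 3:.0f}x throughput")
```

Nine cores at one-third speed deliver three times the throughput for a third of the power – the "green and fast" tradeoff the slide names.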
Leadership Class Computers (supercomputers) will double in performance every couple of years, with likely paths to increased performance:
Exponential growth, with the ability to perform more computing in less space
Architectures that are more efficient and provide more FLOPS per watt through tighter integration and architectural enhancements
Single chips with an unprecedented 100,000 processors and interchip communications in a 3-D configuration
Reduction of component size, leading to more powerful chips
The ability to increase the number of processors, leading to more powerful systems
Keeping the physical size of the system compact to keep communication latency manageable
Increases in power density
Limitations on removing the waste heat efficiently
Power and cooling will be the crucial factors in the future
Supercomputers – The Future of our Data Centers
In the future, 2010 may be known to High Performance Computing as the year of the GPU (Graphics Processing Unit).
Eight of the world’s greenest supercomputers combined specialized accelerators such as GPUs with CPUs to boost performance.
On the Green500 list, supercomputers with accelerators are three times more energy efficient than their non-accelerated counterparts:
accelerated supercomputers averaged 554 MFLOPS per watt, while measured supercomputers without accelerators produced 181 MFLOPS per watt.
The top three green supercomputers were IBM machines, all in Germany, with operating efficiencies as high as 1,500 MFLOPS per watt.
More computing power with less power, space, and cooling – higher “compute densities” at lower “power densities” … that is IT efficiency!
Super-efficient Supercomputing – Multiple HPC Processors: CPU/GPU Processing
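The Green500 metric behind these rankings is simply the sustained floating-point rate divided by power. The 554 and 181 MFLOPS-per-watt averages come from the slide; the sample system below is hypothetical, just to exercise the formula:

```python
# Green500-style efficiency: sustained floating-point rate per watt.

def mflops_per_watt(rmax_gflops, power_kw):
    # GFLOPS -> MFLOPS is x1000 and kW -> W is x1000, so the factors cancel.
    return rmax_gflops * 1_000 / (power_kw * 1_000)

accelerated = 554.0   # average MFLOPS/W with GPU accelerators (from the slide)
cpu_only = 181.0      # average MFLOPS/W without accelerators (from the slide)
print(f"Accelerated systems: ~{accelerated / cpu_only:.1f}x more efficient")

# A hypothetical 100 TFLOPS (100,000 GFLOPS) system drawing 180 kW:
print(f"{mflops_per_watt(100_000, 180):.0f} MFLOPS/W")
```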
Super Energy Efficient Technologies in the Data Center – DOE EE Award
Multi-processor Performance, Redundancy and Reliability
• Implements redundancy in both hardware and software, with an array of cell-type processors surrounding and supporting a multi-core central processor.
• At the hardware level, all major subsystems are redundant and hot swappable, including compute cards, disk, network interface cards, power supplies, and fans.
• At the software layer, customers can configure the system to run active/standby software on two separate management cards.
• In the event of a failure, standby software will assume the responsibility of managing the system without any manual intervention.
• Software also manages the internal Terabit fabric and has the intelligence to configure the hardware to route traffic around failure using multiple alternative fabric paths (path redundancy).
• Modular management software provides process isolation and modularity with each major process operating in its own
The SM10000 uses 1/4 the power, space, and cooling while matching the performance of today’s best-in-class volume servers, and requires no changes to software.
SeaMicro
84-cabinet quad-core Cray XT4 with “four core” AMD Opteron processors
200 upgraded Cray XT5 cabinets with “six core” processors
125 gigaflops of peak processing power
362 terabytes of high-speed memory in the combined system
A Scalable I/O Network (SION) links them together and to the Spider file system
Two thousand trillion calculations per second
ORNL Supercomputers - 2.33 petaflop Cray XT Jaguar
XT5 peak power density = 1,750 watts/sq ft over 4,400 sq ft
ECOphlex™ cooling with R-134a high-temperature refrigerant conditions inlet air and removes heat as air enters and exits each cabinet – a 5% reduction in electricity
480-volt power distribution, with power supplies in each cabinet
Minimized distance from the main switchboards to the computers – reduced materials cost and reduced operating costs
ORNL Supercomputers - 2.33 petaflop Cray XT Jaguar
7.7 MW peak power consumption / 27.5 kW per rack / 50 million kWh max annual consumption / $5M annual electricity bill at $0.10 per kWh
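The power figures quoted here can be cross-checked against one another; only numbers already on the slide are used:

```python
# Cross-checking the Jaguar power figures quoted on the slide.
peak_mw = 7.7               # peak power consumption
rate = 0.10                 # $ per kWh
annual_kwh = 50_000_000     # "50 M kW-h max annual power"

bill = annual_kwh * rate
print(f"Annual bill: ${bill / 1e6:.1f}M")   # matches the $5M on the slide

# Implied average draw versus the 7.7 MW peak (8,760 hours per year):
avg_mw = annual_kwh / 8760 / 1000
print(f"Implied average load: {avg_mw:.1f} MW ({avg_mw / peak_mw:.0%} of peak)")
```

The $5M bill and 50 million kWh are mutually consistent at $0.10/kWh, and the implied average load is roughly three-quarters of the 7.7 MW peak.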
University Research Collaborations: 27 TF IBM Blue Gene/P system with 2,048 850 MHz IBM quad-core 450d PowerPC processors
Application Development: 80-node Linux cluster with quad-core 2.0 GHz AMD Opterons and a gigabit ethernet network with InfiniBand interconnect
Mass Storage Facility: long-term storage – tape and disk storage, servers, and HPSS software
ORNL Supercomputers – IBM Blue Gene, Linux, SGI & AMD: Frost, Lens, Everest
Spray cooling: phase-change cooling with a Fluorinert mist is thermodynamically more efficient than convection cooling with air, so less energy is needed to remove the waste heat while handling a higher heat load.
Air jet impingement: nozzles and manifolds for jet impingement are relatively simple.
Liquid impingement: offers higher heat transfer coefficients as a tradeoff for greater design and operating complexity.
Advanced HPC Air and Liquid Cooling Systems
Liquid cooling with “microchannel” heat sinks and “micropumps” in the processor array
Water-cooled heat sinks and radiators
Liquid metal cooling for high-power-density micro devices
Heat pipes and thermosyphons that exploit the high latent heat of vaporization
Thermoelectric coolers (TECs) that use the Peltier-Seebeck effect for hot spots
Liquid nitrogen as an extreme coolant for short overclocking sessions
Newest Generation Methods of HPC Cooling
Promising architectures with vertical 3D integration of chips
Water removes heat about 4,000 times more efficiently than air
Chip-level cooling with a water temperature of approximately 60°C is sufficient to keep the chip well below the maximum allowable operating temperature of 85°C
Hair-thin liquid-cooling microchannels, measuring only 50 microns in diameter, between the active chips are the missing link to achieving high-performance computing with future 3D chip stacks
Water Cooling 3-D Vertical Integration of HPC Chips
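The "4,000 times" claim is usually justified by comparing volumetric heat capacities. A quick check with standard room-temperature property values (textbook figures, not from the whitepaper):

```python
# Rough check of "water removes heat ~4,000x more efficiently than air"
# via volumetric heat capacity (density * specific heat), a common basis
# for the claim. Standard room-temperature property values.

water = 1000.0 * 4180.0   # kg/m^3 * J/(kg*K) -> ~4.18 MJ/(m^3*K)
air = 1.2 * 1005.0        # kg/m^3 * J/(kg*K) -> ~1.2 kJ/(m^3*K)

print(f"Water carries ~{water / air:,.0f}x more heat per unit volume per kelvin")
```

This gives roughly 3,500x on a per-volume basis, the same order as the slide's 4,000x figure; water's much higher convective heat transfer coefficients add to the advantage.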
3D stacked MPSoC architecture with interlayer microchannel cooling
Bringing Fluid to the Supercomputer Chip
Aquasar will cut energy costs and provide facilities heating.
Microchannel hot-water heat sinks on a server blade from IBM/ETH’s Aquasar supercomputer
Cooling water at 140 to 160 degrees Fahrenheit (60 to 70 degrees Celsius) will keep the computers’ components below a performance-hurting 185 degrees Fahrenheit (85 degrees Celsius).
The high-quality, high-temperature waste heat provides space heating, boiler pre-heat, etc.
“Very Hot Water” HPC Cooling with Aquasar
Data Center Rack and Server Heat Exchangers – 20 kW Air / 40 kW Water / 100 kW Conduction
Water Cooled Overhead Supply & Return
• Rear door cooling rated at 10, 16, 18, 20, 32 & 40 kW
• Adds rack depth and width
• Makes the rack room neutral
• NO fans – server air flow only
• NO electrical connections
• NO aisles required
• Low total fan / pump energy
• High efficiency fans (8-9x efficiency of CW w/EC fan)
• Easy access to rack components
• Piping connections – hard piped, removable, or one-shot
Water Cooled Underfloor Supply & Return
Combined Air & Water Cooling Flow Diagram – Super-Efficient “High Temperature” Cooling to Each Rack
Open Cell Cooling Towers
Chillers w/Integral Free Cooling HXs
Condenser Water Pumps
Chilled Water Pumps
Racks with Rear Door HX
Isolation Valves between Equipment to Segregate Loop
3-Way Control Valve at Each Rack to provide Individual Rack Temperature Control
Looped Chilled Water Distribution at Roof Level
Cross Tie Branches for Each Row of Racks
*Isolation valves will be installed between main equipment on the roof to segregate the loop
(Diagram pipe sizes: 14″–16″ mains, 6″ branches)
M+W Ultra-Low PUE Design with 90% Water & 10% Air Cooling – Combining Supercomputing & Enterprise Spaces
National Renewable Energy Laboratory (NREL), Golden, CO / Tier 1 / 10K SF / 200 W–1 kW per sq ft / Project Cost: $10MM / Cost per SF: $1,000
• Two-level raised floor with a larger mechanical distribution subfloor. The main air handlers were built from an EC fan wall with MERV 13 filters and evaporative cooling.
• Cooling was 90% water and 10% air, with thermal storage and a hybrid water/air-side economizer. Waffle slab with moveable EC fans in pop-outs to direct air to problem locations.
• Elevated chilled water temperature with a normal inlet temperature as a baseline, achieving free cooling 99+% of the time (100% with the new allowable temperatures).
• Electrical distribution was 480V, with all electrical equipment outside except for distribution panels.
• Reused waste heat from IT equipment to heat the Admin space (not counted toward PUE).
• The High Performance Computing Data Center (HPCDC) is a 10,000 SF, N+1 facility at the leading edge of energy efficiency.
• NREL challenged M+W to provide a facility with an ultra-low PUE of no greater than 1.06.
• M+W delivered a design with an independently verifiable PUE of 1.049.
• M+W’s design meets LEED® Gold.
• The facility is designed for a power density of 1,000 W/sq ft.
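The 1.049 figure is straightforward to express with the PUE definition. The annual IT load below is an assumed placeholder; the 4.9% overhead is simply what a PUE of 1.049 implies:

```python
# PUE = total facility energy / IT equipment energy.
# The annual IT load is an assumed placeholder; the 4.9% overhead is
# what a PUE of 1.049 implies.

it_energy_mwh = 8_000                  # assumed annual IT consumption
overhead_mwh = it_energy_mwh * 0.049   # cooling, fans, distribution losses

pue = (it_energy_mwh + overhead_mwh) / it_energy_mwh
print(f"PUE = {pue:.3f}")

# Waste heat reused to warm the Admin space is excluded from the PUE
# calculation, as the slide notes.
```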
Green and Sustainable Energy Sources – A New Direction for Data Centers
Utility Partial or Full On Site Generation
Cogeneration & Trigeneration
DC for Facilities (Pumps, Fans, Lighting, etc.)
Fuel Cells
Heat Pumps
Bloom Energy
Solar (DC)
Wind (DC)
CHP w/Capstone Turbines
Dual Fuel Source Generation (peak shaving)
Efficiencies in Modular & High Performance Computing Centers
QUESTIONS & ANSWERS
Contact:
Bruce Myatt, PE
Director, Mission Critical Facilities
North America Design & Design-Build
Tel: 415.609.4242
Cell: 415.748.0515
bruce.myatt@mwgroup.net
www.usa.mwgroup.net
• Albany, New York (Headquarters)
• Chicago, Illinois
• Danbury, Connecticut
• Greenville, South Carolina
• Guaynabo, Puerto Rico
• Indianapolis, Indiana
• Montreal, Quebec, Canada
• Phoenix, Arizona
• Plano, Texas
• Raleigh, North Carolina
• Rio Rancho, New Mexico
• San Francisco, California