Holistic, Energy Efficient Design @ Cardiff
Dr Hugh Beedie
CTO
ARCCA & INSRV
Going Green Can Save Money
Introduction
• The Context
• Drivers to Go Green
• Where does all the power go?
  • Before the equipment
  • In the equipment
• What should we do about it?
• What Cardiff University is doing about it
The Context (1)
• Cardiff University receives £3M grant to purchase a new supercomputer
• A new room is required to house it, with appropriate power, cooling, etc.
• 2 tenders
  • Data Centre construction
  • Supercomputer
The Context (2)
INSRV Sustainability Mission: To minimise CU’s IT environmental impact and to be a leader in delivering sustainable information services.
Some current & recent initiatives:
• University INSRV Windows XP image default settings
• Condor – saving energy, etc. compared to a dedicated supercomputer
• ARCCA & INSRV new Data Centre
• PC Power saving project – standby 15 mins after logout (being implemented this session)
Drivers – Why do Green IT?
• Increasing demand for CPU & Storage
• Lack of Space
• Lack of Power
• Increasing energy bills (oil prices doubled)
• Enhancing the Reputation of Cardiff University & attracting better students
• Sustainable IT
• Because we should (for the Planet)
Congress Report Aug 2007
• US Data Centre electricity demand doubled 2000-2006
• Trends toward 20kW+ per rack
• Large scope for efficiency improvement
  • Obvious – more efficiency at each stage
  • Holistic approach necessary – facility and component improvements
  • Less obvious – virtualisation (up to 5X)
Where does all the power go? (1)
“Up to 50% is used before getting to the Server”
Report to US Congress Aug 2007
Loss = £50,000 p.a. for every 100kW supplied to the room
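As a rough cross-check of the two figures on this slide, the sketch below works out the electricity price they imply. It assumes the "up to 50%" is taken at its upper bound (50kW lost per 100kW supplied) and that the room runs all year; the function names are illustrative only.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annual_cost_gbp(power_kw: float, price_gbp_per_kwh: float) -> float:
    """Cost of drawing power_kw continuously for one year."""
    return power_kw * HOURS_PER_YEAR * price_gbp_per_kwh


def implied_price_gbp_per_kwh(power_kw: float, annual_cost: float) -> float:
    """Unit price at which power_kw running all year costs annual_cost."""
    return annual_cost / (power_kw * HOURS_PER_YEAR)


supplied_kw = 100.0
lost_fraction = 0.5  # assumption: "up to 50%" taken at its upper bound
lost_kw = supplied_kw * lost_fraction

price = implied_price_gbp_per_kwh(lost_kw, 50_000.0)
print(f"{lost_kw:.0f} kW lost, implied price {price * 100:.1f} p/kWh")
# prints "50 kW lost, implied price 11.4 p/kWh"
```

Under those assumptions the quoted loss corresponds to a tariff of roughly 11p per kWh; a different tariff scales the loss in proportion.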
Where does all the power go? (2)
Where does all the power go? (3)
How?
Power Conversion: before it gets to your room, you lose in the HV->LV transformer
• Efficiency = 98%, not 95%
Return On Investment (ROI)?
• New installation, ROI = 1 month
• Replacement, ROI = 1 year
• Lifetime of investment = 20+ yrs!
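The effect of moving from a 95% to a 98% efficient transformer can be sketched as follows. The load, the electricity price and the extra capital cost are hypothetical placeholders, not Cardiff figures, and "ROI" is read here as a payback period, which is how the slide appears to use it.

```python
HOURS_PER_YEAR = 24 * 365


def transformer_loss_kw(load_kw: float, efficiency: float) -> float:
    """Power lost in the transformer while delivering load_kw on the LV side."""
    return load_kw / efficiency - load_kw


def payback_months(extra_capital_gbp: float, annual_saving_gbp: float) -> float:
    """Months needed for the annual saving to repay the extra capital cost."""
    return 12.0 * extra_capital_gbp / annual_saving_gbp


load_kw = 100.0            # hypothetical LV load
price_gbp_per_kwh = 0.10   # hypothetical tariff
extra_capital_gbp = 500.0  # hypothetical premium for the low loss unit

saved_kw = transformer_loss_kw(load_kw, 0.95) - transformer_loss_kw(load_kw, 0.98)
annual_saving_gbp = saved_kw * HOURS_PER_YEAR * price_gbp_per_kwh

print(f"Saved {saved_kw:.2f} kW, about £{annual_saving_gbp:.0f} p.a.")
print(f"Payback: {payback_months(extra_capital_gbp, annual_saving_gbp):.1f} months")
```

Because the loss is paid for every hour of a 20+ year lifetime, even a small efficiency gain repays a modest price premium quickly.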
Where does all the power go? (4)
How?
Cooling infrastructure
• Typical markup 75%
• Lowest markup 25-30%?
• Est. ROI 2-3 years (lifetime 8 years)
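To put the markup figures in kW terms, here is a minimal sketch. It assumes "markup" means cooling power expressed as a fraction of the IT load being cooled, which the slide does not state explicitly, and it uses a hypothetical 100kW IT load.

```python
def cooling_power_kw(it_load_kw: float, markup: float) -> float:
    """Cooling power, reading markup as cooling power / IT load (assumption)."""
    return it_load_kw * markup


it_load_kw = 100.0  # hypothetical IT load

typical_kw = cooling_power_kw(it_load_kw, 0.75)
best_case_kw = cooling_power_kw(it_load_kw, 0.25)
worst_of_best_kw = cooling_power_kw(it_load_kw, 0.30)

print(f"Typical cooling draw: {typical_kw:.0f} kW")
print(f"Efficient cooling draw: {best_case_kw:.0f}-{worst_of_best_kw:.0f} kW")
print(f"Saving: {typical_kw - worst_of_best_kw:.0f}-{typical_kw - best_case_kw:.0f} kW")
```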
Where does all the power go? (4)
How?
Backup power (UPS)
• Efficiency = 80-95%
• Est. ROI for new installation: <1 year
• Replacement not so good, UPS life 3-5 yrs only?
[Figure: UPS efficiency (%) vs load (%)]
Where does it go? – Bull View
[Figure: cumulative power as a share of Data Centre consumption. Labels: Loads, Power delivery, Cooling; marks at 40%, 80% and 100%]
Where does it go? – Intel View
• Load (CPU, Memory, Drives, I/O): 100W (36.4%)
• PSU: 50W (18.2%)
• Voltage Regulators: 20W (7.3%)
• Server fans: 15W (5.5%)
• UPS + PDU: 20W (7.3%)
• Room cooling system: 70W (25.5%)
• Total: 275W
Source: Intel Corp.
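The percentages in the Intel view follow directly from the wattages. The short sketch below recomputes them and adds one derived figure not on the slide: the overhead drawn for each watt that reaches the load.

```python
# Wattages as quoted in the Intel view.
intel_view_w = {
    "Load (CPU, Memory, Drives, I/O)": 100,
    "PSU": 50,
    "Voltage Regulators": 20,
    "Server fans": 15,
    "UPS + PDU": 20,
    "Room cooling system": 70,
}

total_w = sum(intel_view_w.values())
shares_pct = {name: 100 * watts / total_w for name, watts in intel_view_w.items()}

load_w = intel_view_w["Load (CPU, Memory, Drives, I/O)"]
overhead_per_load_watt = (total_w - load_w) / load_w

for name, pct in shares_pct.items():
    print(f"{name}: {pct:.1f}%")
print(f"Total: {total_w} W, overhead per load watt: {overhead_per_load_watt:.2f} W")
```

On these figures, every watt of useful load costs a further 1.75W in conversion, fans and cooling.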
Where does it go? – APC View
• Chiller 33%
• Humidifier 3%
• CRAC 9%
• IT Equipment 30%
• PDU 5%
• UPS 18%
• Main switchgear / Generator 1%
• Lighting 1%
[Figure: electrical power IN to the indoor data center, converted to heat, waste heat OUT]
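The APC breakdown can be regrouped to compare it with the other views. The grouping into cooling, power delivery, IT and lighting below is an assumption made for illustration; the slide only gives the individual percentages.

```python
# Percentages as quoted in the APC view.
apc_view_pct = {
    "Chiller": 33,
    "Humidifier": 3,
    "CRAC": 9,
    "IT Equipment": 30,
    "PDU": 5,
    "UPS": 18,
    "Main switchgear / Generator": 1,
    "Lighting": 1,
}

# Assumed grouping, not taken from the slide.
groups = {
    "Cooling": ["Chiller", "Humidifier", "CRAC"],
    "Power delivery": ["PDU", "UPS", "Main switchgear / Generator"],
    "IT Equipment": ["IT Equipment"],
    "Lighting": ["Lighting"],
}

group_pct = {
    group: sum(apc_view_pct[item] for item in items)
    for group, items in groups.items()
}
input_per_it_watt = sum(apc_view_pct.values()) / apc_view_pct["IT Equipment"]

for group, pct in group_pct.items():
    print(f"{group}: {pct}%")
print(f"Facility input per IT watt: {input_per_it_watt:.2f} W")
```

With this grouping, cooling takes 45% and power delivery 24%, so about 3.3W enter the facility for each watt used by IT equipment.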
Server Power Consumption
Server Components Power Consumption
• PSU losses 38W
• Fan 10W
• CPU 80W
• Memory 36W
• Disks 12W
• Peripheral slots 50W
• Motherboard 25W
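One figure that can be derived from this breakdown is the PSU efficiency it implies. The sketch assumes that the listed component wattages are all supplied through the PSU and that the 38W of PSU losses sit on top of them; the slide does not say so explicitly.

```python
# Component draws as quoted on the slide (assumed to be fed by the PSU).
components_w = {
    "Fan": 10,
    "CPU": 80,
    "Memory": 36,
    "Disks": 12,
    "Peripheral slots": 50,
    "Motherboard": 25,
}
psu_loss_w = 38

dc_load_w = sum(components_w.values())  # power delivered by the PSU
wall_w = dc_load_w + psu_loss_w         # power drawn from the supply
psu_efficiency = dc_load_w / wall_w

print(f"Delivered {dc_load_w} W of {wall_w} W drawn")
print(f"Implied PSU efficiency: {psu_efficiency:.1%}")
```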
Options for Cardiff (1)
• Carry on as before
  • Dual core HPC solutions
• Wait for quad core
  • Improves Flops/watt
  • Saves on infrastructure (fewer network ports)
  • Saves on management (fewer nodes)
  • Saves on space
  • Saves on power
Options for Cardiff (2)
• High density solution
  • Needs specialist cooling over 8kW per rack
• Carry on as before (6-8kW per rack)
  • Probable higher TCO
• Low density solution (typically)
  • BT – free air cooling
  • Allow wider operating temp range – warranty issues?
  • Not applicable here (no space)
What did Cardiff do? (1)
• Ran 2 projects
  • HPC equipment
  • Environment
• TCO as key evaluation criterion
  • Plus need to measure and report on usage
• Problems
  • Finger pointing (strong project mgt)
  • Scheduling (keep everyone in the loop)
Timetable
Tender element   Date of issue   Date of order                  Reason for delay
Room Tender      April 2007      July 2007 (to Comtec)          No delay
HPC Tender       January 2007    December 2007                  Waiting for quad core
HV Transformer   August 2007     March 2008 (Cardiff Estates)   Long lead times on low loss transformers
What did Cardiff do? (2)
• Bought Bull R422 servers for HPC
  • 80W Quad core Harpertown
  • 2 dual socket, quad core servers in 1U – common PSU
  • Larger fans (not on CPU)
• Other project in same room
  • IBM Bladecentres
  • Some pizza boxes
Back Up Power
APC 160kW Symmetra UPS
• Full and half load efficiency 92%
• Scaleable & Modular – could grow as we grew
• Strong environment management options
  • Integrated with cooling units
  • SNMP
• Bypass capability
• Bought 2 (not fully populated)
  • 1 for compute nodes
  • 1 for management and another project
Enhanced existing standby generator
Cooling Inside the Room
APC Inline RC units
• Provides residual cooling to the room
• Resilient against loss of an RC unit
• Cools hot air without mixing with cold air
[Figure: APC Netshelter airflow, showing the front of a row with servers and InRow cooling units]
Cooling – Outside the Room
3 Airedale 120kW Chillers
Ultima Compact Free Cool 120kW
Quiet model
Variable speed fans
N+1 arrangement
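The N+1 arrangement can be expressed as a simple capacity check: with three 120kW chillers, the design load should fit within two of them. The function name below is illustrative only.

```python
def capacity_with_failures_kw(unit_kw: float, units: int, failed: int) -> float:
    """Cooling capacity left when `failed` of `units` identical chillers are out."""
    if not 0 <= failed <= units:
        raise ValueError("failed must be between 0 and units")
    return unit_kw * (units - failed)


unit_kw = 120.0
units = 3

all_running_kw = capacity_with_failures_kw(unit_kw, units, failed=0)
n_plus_1_kw = capacity_with_failures_kw(unit_kw, units, failed=1)

print(f"All chillers running: {all_running_kw:.0f} kW")
print(f"One chiller out (N+1 design capacity): {n_plus_1_kw:.0f} kW")
```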
Free-cooling vs Mechanical cooling
• System operated on 100% free-cooling (12% of year)
• System operated on partial free-cooling (50% of year)
• System operated on mechanical cooling only (38% of year)
[Figure: cooling load against ambient temperature, with marks at -7°C, 3.5°C and 12.5°C, showing the split between mechanical (compressor) cooling and free-cooling]
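The yearly split between cooling modes translates into hours as sketched below. The fraction of compressor work still needed during partial free-cooling is not given on the slide, so `partial_compressor_fraction` is a hypothetical parameter included only to show how an overall estimate could be formed.

```python
HOURS_PER_YEAR = 24 * 365

# Share of the year spent in each mode, as quoted on the slide.
mode_share = {
    "100% free-cooling": 0.12,
    "partial free-cooling": 0.50,
    "mechanical cooling only": 0.38,
}

mode_hours = {mode: share * HOURS_PER_YEAR for mode, share in mode_share.items()}


def compressor_equivalent_share(partial_compressor_fraction: float) -> float:
    """Share of the year's cooling met by compressors.

    partial_compressor_fraction is a hypothetical average for the
    partial free-cooling period (0 = no compressor work, 1 = all).
    """
    return (
        mode_share["mechanical cooling only"]
        + partial_compressor_fraction * mode_share["partial free-cooling"]
    )


for mode, hours in mode_hours.items():
    print(f"{mode}: {hours:.0f} h/year")
print(f"Compressor share at 50% partial: {compressor_equivalent_share(0.5):.0%}")
```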
Cost Savings Summary
• Low loss transformer: £10k p.a.
• UPS: £20k p.a.
• Cooling: £50k p.a. (estimated)
• Servers
  • 80W part: £20k p.a.
  • Quad core: same power but twice the ‘grunt’
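Adding up the quoted savings gives the overall annual figure. The sketch below simply totals the four line items and shows each one's share; the cooling figure is an estimate in the source, so the total is too.

```python
# Annual savings in £k, as quoted in the summary (cooling is an estimate).
savings_k_gbp = {
    "Low loss transformer": 10,
    "UPS": 20,
    "Cooling": 50,
    "Servers (80W part)": 20,
}

total_k_gbp = sum(savings_k_gbp.values())
share_pct = {item: 100 * value / total_k_gbp for item, value in savings_k_gbp.items()}

for item, value in savings_k_gbp.items():
    print(f"{item}: £{value}k p.a. ({share_pct[item]:.0f}%)")
print(f"Total: £{total_k_gbp}k p.a.")
```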
Lessons learned from SRIF3 HPC Procurement - Summary
• Strong project management essential
• IT and Estates liaison essential but difficult
• Good Supplier relationship essential
• Major savings possible
  • Infrastructure (power & cooling)
  • Servers (density and efficiency)