HCLT Whitepaper: Thermal Design and Management of Servers

Thermal Design and Management of Servers February 2010



© 2010, HCL Technologies. Reproduction Prohibited. This document is protected under Copyright by the Author, all rights reserved.

Abstract

In today’s digital age of rapid knowledge development, an enormous amount of information is generated every day across the world. This data needs to be stored, processed and secured so that users can access it quickly. Servers play a major role in these data-intensive business applications. Advancements in hardware, software and miniaturization technologies, along with the information evolution, have led to a vast increase in server power densities and computing power. To improve reliability and enhance performance, servers need thermal management that removes the heat generated by their devices.

This paper focuses on the role of thermal management of servers in data centers and green data centers. It also investigates the challenges faced in thermal design and management of servers. The emerging cooling technologies which have evolved over the years in the server industry will be discussed. Case studies on thermal management of servers will be presented.

Contents

Abstract

Introduction

Role of Servers in Data Centers

Thermal Challenges in Servers

Innovative Cooling Solutions

HCL Case Studies

Conclusion

References

Acronyms

Authors

About HCL


Introduction

Any computer or device providing services can technically be called a server. In the hardware sense, a server is a computer model intended for running software applications under the heavy demand of a network environment. With the evolution of the internet, the amount of information exchanged with servers is vast. In World Wide Web applications, servers play a major role in helping data reach the user in fractions of a second. A typical server consists of multi-core CPUs, DIMMs, hard drives, power supply units, network connections, etc. Servers are classified based on their applications, as shown in Fig. 1.

As information increased every day, server capabilities increased in line with demanding business applications. IDC[3] estimates that server system density has increased by 15% annually over the last 10 years as organizations have moved from pedestal servers to rack-optimized systems, and now to extensive implementation of blade servers (Fig. 2).
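To put the IDC figure in perspective, 15% annual growth compounds to roughly a fourfold increase in system density over ten years. The short calculation below is only an illustrative sketch: the function name is hypothetical, and it assumes the 15% rate applies uniformly in every year.

```python
def compound_growth_factor(annual_rate: float, years: int) -> float:
    """Return the overall multiplier after compounding annual_rate for the given years."""
    return (1.0 + annual_rate) ** years


# 15% per year over 10 years, as quoted from IDC in the text.
density_factor = compound_growth_factor(0.15, 10)
print(f"Density multiplier after 10 years: {density_factor:.2f}x")
```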

Fig. 1: Servers classification [figure: servers divided into general based and application based; labels shown include print server, game server, DNS server, database server, web server, communication server, enterprise application server, modem, and personal computer]

Fig. 2: Worldwide server installed base by form factor, 1996-2010 (Source: IDC, 2006)

This shift toward smaller form factors has increased the demands on power and cooling management at the rack level. While the average power consumption per rack in the year 2000 was 1kW, datacenter managers today must account for 6.8kW per rack, and must plan to manage over 20kW in the next five years. The trend toward high density has resulted in hot spots within the datacenter that are subject to failures and reliability concerns.

As servers became more powerful and compact, their numbers increased with the information evolution. Fig. 3[3] shows the worldwide server installed base, new server spending, and power and cooling expenses. One interesting observation from Fig. 3 is that power and cooling expenses were approaching new server spending. This means that power and cooling expenses were on course to outweigh the cost of the hardware and software in the future.

Fig. 3: Worldwide server installed base, new server spending, and power and cooling expense (Source: IDC, 2006)

Fig. 4 shows the worldwide thermal management market trend reported by BCC Research, USA[11]. Technology spending increased to an estimated $6.8 billion by the end of 2008 and should reach $11 billion by 2013. This means investment in thermal management is growing strongly.

Fig. 4: World thermal management market trend from the year 2007 to 2013 (Source: BCC research, USA)
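The two market figures quoted from BCC Research imply an annual growth rate of about 10%. The sketch below shows the arithmetic. It assumes smooth compounding over the five years between the 2008 estimate and the 2013 projection, and the function name is hypothetical.

```python
def compound_annual_growth_rate(start_value: float, end_value: float, years: int) -> float:
    """Return the constant yearly growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# $6.8 billion (end of 2008) growing to $11 billion (2013), as quoted in the text.
market_cagr = compound_annual_growth_rate(6.8, 11.0, 5)
print(f"Implied annual growth, 2008 to 2013: {market_cagr:.1%}")
```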


Thermal management should be carried out for a server in order to increase its performance and reliability, as servers are consuming more power and dissipating more heat. Computational Fluid Dynamics (CFD) simulation software and thermal management techniques have evolved over the years to meet the cooling demands of ever-growing servers. With the help of CFD software and emerging technologies in the industry, thermal engineers can provide thermal solutions for these power-hungry servers.

Role of Servers in Data Centers

A data center is a collection of computer servers usually maintained by an enterprise to accomplish server needs far beyond the capability of one machine. These centers run enormously scaled software applications with millions of users. As data centers increasingly become the nerve centers of business and society, the demand for bigger and better ones has increased. There is a growing need to produce the most computing power per square foot at the lowest possible cost in energy and resources, which brings a new level of attention and new challenges.

The growth in the number of servers and of the Internet is driving up energy consumption. As servers become more powerful, more kilowatts are needed to run and cool them. As data centers grow to unprecedented scales, attention has shifted to making servers less energy intensive. Brill of the Uptime Institute[12] notes that while it once took 30 to 50 years for electricity costs to match the cost of the server itself, the electricity cost of a low-end server will now exceed the cost of the server in less than four years. The huge power draws have spurred innovation, through computational fluid dynamics modeling, in the thermal management of servers from the component level to the rack level and the data center level.
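The "less than four years" observation can be reproduced with simple arithmetic. The sketch below is purely illustrative: the server price, power draw, electricity tariff and cooling overhead factor are hypothetical values chosen for the example and do not come from the Uptime Institute or from this paper.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, assuming continuous operation


def years_until_energy_cost_matches(server_price: float,
                                    it_power_kw: float,
                                    tariff_per_kwh: float,
                                    overhead_factor: float) -> float:
    """Years of operation after which cumulative electricity cost equals the server price.

    overhead_factor scales the IT power to include cooling and power
    distribution losses (for example, 2.0 means one extra watt per IT watt).
    """
    annual_energy_kwh = it_power_kw * overhead_factor * HOURS_PER_YEAR
    annual_cost = annual_energy_kwh * tariff_per_kwh
    return server_price / annual_cost


# Hypothetical low-end server: $1500, 250 W draw, $0.10 per kWh, overhead factor 2.0.
payback_years = years_until_energy_cost_matches(1500.0, 0.25, 0.10, 2.0)
print(f"Electricity cost matches server price after {payback_years:.1f} years")
```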

Servers for Green Data Centers

A green data center is fundamentally a repository for the storage, management, and dissemination of data. In this center, the mechanical, lighting, electrical and computer systems are designed to maximize energy efficiency and minimize environmental impact. The operational scope of a green data center design for the IT industry includes the following aspects:

Minimizing the carbon footprints of buildings

Use of alternative energy technologies

A green data center focuses on two primary objectives, namely energy conservation and environmental safety. This is achieved by optimizing the performance of cooling systems using real-time sensing technology; that is, green data centers perform sensor-based optimal cooling.

A green data center optimizes power consumption and cooling right from the chip to the chiller. Thus, green data centers with real-time smart cooling[4] functionality can provide up to 35-40% savings in terms of power consumption.
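Sensor-based cooling can be pictured as a control loop that raises cooling effort only where and when temperature sensors call for it. The paper does not describe the control algorithm, so the sketch below is a generic proportional rule offered only as an illustration; the function name, setpoint, gain and duty limits are all hypothetical.

```python
def fan_duty_percent(inlet_temp_c: float,
                     setpoint_c: float = 25.0,
                     gain_percent_per_c: float = 8.0,
                     min_duty: float = 20.0,
                     max_duty: float = 100.0) -> float:
    """Map a sensed inlet temperature to a fan duty cycle (proportional control).

    Below the setpoint the fan idles at min_duty; above it, the duty rises
    linearly with the temperature error and is capped at max_duty.
    """
    error_c = max(inlet_temp_c - setpoint_c, 0.0)
    duty = min_duty + gain_percent_per_c * error_c
    return min(max_duty, duty)


# Example sensor readings in degrees Celsius (hypothetical).
for reading_c in (22.0, 27.0, 40.0):
    print(f"{reading_c:.0f} C -> fan duty {fan_duty_percent(reading_c):.0f}%")
```

Running the fans at the minimum duty whenever the sensed temperature allows it is where the energy saving comes from in this kind of scheme.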


Thermal Challenges in Servers

The rise in power densities, together with performance and reliability constraints, produces major hurdles for the thermal management of servers. These thermal challenges vary with different products and requirements. The following challenges are faced in servers:

Very high power dissipation

High ambient temperature

Stringent thermal requirements (HDDs, PSUs, Processor, etc.)

Harsh environment

Meeting strict standards and compliances

Miniaturization

Product design cycle time reduction

Minimizing the thermal cost

Feasible solution using available resources

Considering all the above challenges for servers, the electronics cooling industry has repeatedly come up with path-breaking innovative solutions.

Innovative Cooling Solutions

The solutions for thermal challenges have led to innovation in thermal management. The latest technologies in the thermal management arena function in and around the basic heat transfer modes. The development has reached a stage where the technologies overlap the basic functional industrial domains. The development of technologies is moving from single-phase heat transfer to multi-phase heat transfer, which has led to the design of advanced cooling solutions. Table 1 describes the innovative cooling solutions which can be used in servers.

Table 1: Innovative Cooling Solutions

Sl. No. | Technology Type | Cooling Technique | Description
1 | Conduction Cooling | Conduction | An important cooling technique in which heat is transferred to the surrounding ambient through conduction heat transfer.
2 | General Cooling Technologies | Heat Sink | An extended (typically extruded) surface whose additional area increases the heat transfer rate.
2 | General Cooling Technologies | Fan | When a heat sink alone is not sufficient, a fan is used to increase the heat transfer coefficient. This solution is used extensively.
2 | General Cooling Technologies | Fan and Heat Sink | Combining a fan with a heat sink improves the chances of reaching a solution in less time. Both increase the heat transfer rate.
3 | Advanced Cooling Technologies | Cryogenics Cooling | Has applications in high power electronics and in high speed, high performance computing.
3 | Advanced Cooling Technologies | Refrigerant Cooling | Used when sub-cooled temperatures are required in a system.
3 | Advanced Cooling Technologies | Direct Immersion Cooling | The components or the system are directly immersed in a liquid bath.
3 | Advanced Cooling Technologies | Hybrid Cooling | More than one coolant is used. Cooling is achieved through more than two different modes of heat transfer.
3 | Advanced Cooling Technologies | Micro Channel | Liquid flows through micro channels drilled in a chip and boiling heat transfer occurs. The heat transfer rate is high in this technique.
3 | Advanced Cooling Technologies | Spray Cooling | A liquid jet is sprayed across a chip and heat is removed from the chip through liquid vaporization.
3 | Advanced Cooling Technologies | Cold Plate Technologies | A highly conductive material with flow passes is placed in contact with the power dissipating components. A low melting point liquid flows through the passes and removes heat from the chip.
4 | Smart Cooling Technologies | Electro Wetting | Heat is removed from the chip with electron movement.
4 | Smart Cooling Technologies | Spot Cooling | Wherever hot spots are present, a cold fluid is blown across the spot to cool it. This can be achieved directly or indirectly through a fan.
4 | Smart Cooling Technologies | Vapor Chamber Cooling | The heat sink base, which is in contact with the chip, is filled with a low melting point liquid. Heat transferred from the chip to the base is dissipated to the ambient through evaporation and condensation in the vapor chamber.
4 | Smart Cooling Technologies | Heat Pipes/Heat Super Conductors | Hollow tubes filled with a low melting point liquid, with the tube wall and liquid separated by a wick. They behave as a very highly conductive material that helps disperse heat with low resistance. This is a very effective, noise-free technique used to remove heat from processor chips.
4 | Smart Cooling Technologies | Compact Heat Exchangers | With miniaturization in place, heat is removed from the system to the ambient through compact heat exchangers.
4 | Smart Cooling Technologies | Phase Change Material | Used for transient cooling in conjunction with a heat sink, where either the heat sink base or the complete heat sink is filled with this material. As it absorbs heat, the material changes phase from solid to liquid, and its temperature remains constant for a while. Later, it reverts to the solid phase. In the future, this can be used for cooling server components.
4 | Smart Cooling Technologies | Micro TECs Cooling | Works on the Peltier effect. Has applications in sub-ambient cooling.
4 | Smart Cooling Technologies | eTECs (Embedded TECs) | Applies to today’s multi-core processors, where some of the cores may be very hot and their temperature must be brought down to the allowable limit. TECs are inscribed at a certain portion of the die to control the die temperature.
4 | Smart Cooling Technologies | Jet Impingement Cooling | A very high velocity liquid jet is pumped across a chip and carries the heat away. Rapid cooling takes place and the heat transfer rate is high.

HCL Case Studies

HCL has successfully provided innovative thermal management solutions for various servers, from the concept phase through the full product life cycle. The following two case studies illustrate HCL’s capability in thermal management solutions for servers.

Thermal Management of High End Server

A high end server consists of as many as 13 sublevel nodes packed in two rows in a compact chassis. A sublevel node is shown in Fig. 5. Each sublevel node consists of six IO cards, 32 DIMM cards, four multi-core processors, one voltage transformation module and two DCA channels. The total power dissipation of each node is 2.4kW and the total power dissipation of the full high end server is 31.2kW. Providing a thermally feasible, optimal solution for this high end server is very challenging.
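A first-order energy balance gives a feel for the airflow such a node needs: all dissipated power must be carried away by the air, so the volumetric flow is Q = P / (rho * cp * dT). The sketch below checks the 13 x 2.4kW total and estimates the flow per node. The air properties and the 15 K allowable air temperature rise are assumptions made for illustration, not values from the case study.

```python
AIR_DENSITY_KG_M3 = 1.2          # assumed: air near sea level at about 20 C
AIR_SPECIFIC_HEAT_J_KGK = 1005.0  # assumed: specific heat of air
M3S_TO_CFM = 2118.88              # cubic feet per minute in one m^3/s


def required_airflow_m3s(power_w: float, air_temp_rise_k: float) -> float:
    """Volumetric airflow needed to remove power_w with the given air temperature rise."""
    return power_w / (AIR_DENSITY_KG_M3 * AIR_SPECIFIC_HEAT_J_KGK * air_temp_rise_k)


NODE_POWER_W = 2400.0  # per sublevel node, from the text
NODE_COUNT = 13        # sublevel nodes per server, from the text

server_power_w = NODE_POWER_W * NODE_COUNT
node_airflow_m3s = required_airflow_m3s(NODE_POWER_W, air_temp_rise_k=15.0)

print(f"Server power: {server_power_w / 1000:.1f} kW")
print(f"Airflow per node: {node_airflow_m3s:.3f} m^3/s "
      f"({node_airflow_m3s * M3S_TO_CFM:.0f} CFM)")
```

A smaller allowable temperature rise raises the required flow in inverse proportion, which is why limited flow rate appears repeatedly among the thermal challenges.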

Fig. 5: Sublevel Node

Thermal Challenges

• Multi-Core Processors Cooling
- Total power dissipation = 920W
- Space is limited and preheated air flows across the processors, making the design more challenging
- Advanced cooling technologies from the industry are needed to cool these next generation processors

• DIMMs Cooling
- Total power dissipation = 736W
- Space constraint between DIMMs
- Low available flow rate

• Voltage Transformation Module (VTM) Channel Cooling
- Total power dissipation = 370W
- It has one of the critical flow path designs in the server
- VTMs are placed in a narrowed flow channel through which fluid enters the modules; this is one of the geometric constraints for the thermal design
- The second constraint is the flow rate available to cool the modules

• System Pressure Drop Optimization
- As it is a very high power dissipating system, the system needs to be optimized in a way that maximizes the heat transfer and minimizes the pressure drop
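The link between pressure drop and available flow can be illustrated with a fan operating point: the flow settles where the fan curve meets the system resistance curve. The sketch below is a simplified model, not the method used in the case study. It assumes a linear fan curve and a quadratic system curve (pressure drop = k * Q^2), and all numerical values are hypothetical.

```python
import math


def operating_flow_m3s(fan_max_pressure_pa: float,
                       fan_max_flow_m3s: float,
                       system_k: float) -> float:
    """Flow at which a linear fan curve meets a quadratic system curve.

    Fan curve:    dp = fan_max_pressure_pa * (1 - Q / fan_max_flow_m3s)
    System curve: dp = system_k * Q**2
    Setting them equal gives system_k*Q^2 + b*Q - fan_max_pressure_pa = 0,
    with b = fan_max_pressure_pa / fan_max_flow_m3s; the positive root is returned.
    """
    b = fan_max_pressure_pa / fan_max_flow_m3s
    discriminant = b * b + 4.0 * system_k * fan_max_pressure_pa
    return (-b + math.sqrt(discriminant)) / (2.0 * system_k)


# Hypothetical fan: 200 Pa at zero flow, 0.2 m^3/s at zero pressure.
flow_before = operating_flow_m3s(200.0, 0.2, system_k=5000.0)
# Halving the system resistance coefficient models a lower pressure drop design.
flow_after = operating_flow_m3s(200.0, 0.2, system_k=2500.0)

print(f"Flow before optimization: {flow_before:.4f} m^3/s")
print(f"Flow after optimization:  {flow_after:.4f} m^3/s")
```

In this model, reducing the system resistance moves the operating point to a higher flow rate with the same fan, which is the motivation for treating pressure drop as a design target.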

Innovative Thermal Solutions

• Multi-Core Processors Cooling
- Designed and optimized a customized heat sink that fulfills the thermal requirements
- The heat sink was designed to minimize weight and reduce cost
- The base and center fins of the heat sink were made of copper, and the outer fins were made of aluminum
- Heat pipes connect the aluminum fins to the base of the heat sink
- This combination of different heat sink materials and heat pipes keeps the processor within its allowable temperature limit
- This innovative design added value for the customer

• DIMMs Cooling
- The Nova chip, which controls the DRAMs, is cooled with a custom made, optimized heat sink
- The Nova chip heat sink was designed under a height constraint (the distance between adjacent DIMMs is small) and with a low available flow rate across the DIMMs
- The innovative design of Nova chip heat sinks across all DIMMs made cooling possible for the high dissipating DIMMs

• Voltage Transformation Module Channel Cooling
- Two different innovative concepts were proposed for VTM channel cooling
- One is a generic solution with innovative, optimized individual heat sinks for each VTM within a narrow flow channel. These new heat sinks were fully custom made
- The other is a novel concept in which a thermal solution was achieved with the available flow rate within a narrow flow channel. This is achieved mainly through conduction cooling and partly through convection cooling
- Optimal pressure drop is achieved to maximize the flow rate through the channel

• System Pressure Drop Optimization
- From the start of the thermal solution, emphasis was placed on the pressure drop of each and every module across the system
- Pressure drop optimization helped reduce acoustic problems while fully utilizing the available flow rate across the system
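A heat sink target of this kind is usually expressed as a thermal resistance: the allowable temperature difference divided by the power to be removed. The sketch below is a minimal illustration and not HCL’s design calculation. It assumes the 920W processor total is shared equally by the four processors of a node, and the maximum case temperature and local (preheated) air temperature are hypothetical values.

```python
def required_thermal_resistance(power_w: float,
                                max_case_temp_c: float,
                                local_air_temp_c: float) -> float:
    """Largest case-to-air thermal resistance (K/W) that keeps the case within its limit."""
    return (max_case_temp_c - local_air_temp_c) / power_w


def case_temperature(power_w: float,
                     thermal_resistance_k_per_w: float,
                     local_air_temp_c: float) -> float:
    """Case temperature for a given heat sink resistance: T_case = T_air + P * R."""
    return local_air_temp_c + power_w * thermal_resistance_k_per_w


TOTAL_PROCESSOR_POWER_W = 920.0  # from the text
PROCESSOR_COUNT = 4              # from the text; equal power split is an assumption
power_per_processor_w = TOTAL_PROCESSOR_POWER_W / PROCESSOR_COUNT

MAX_CASE_TEMP_C = 70.0   # hypothetical allowable case temperature
LOCAL_AIR_TEMP_C = 45.0  # hypothetical preheated air temperature at the heat sink

r_required = required_thermal_resistance(
    power_per_processor_w, MAX_CASE_TEMP_C, LOCAL_AIR_TEMP_C)
print(f"Power per processor: {power_per_processor_w:.0f} W")
print(f"Required case-to-air resistance: {r_required:.3f} K/W")
```

Preheated air shrinks the available temperature difference, so the required resistance falls; that is the pressure that pushes a design toward copper bases, heat pipes and optimized fins.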

Thermal Management of Telecommunication Server

A typical telecommunication server consists of a power supply unit, hard disk, IO board, motherboard, network switching board, and a power distribution board. This server is a conversion from a 1U rack design to a 2U rack design. The total power dissipation of the server is 640W. A thermal solution for this 2U rack server must be provided for sea level as well as high altitude conditions. Fig. 6 shows an overview of the telecommunication rack.

Fig. 6: Telecommunication Rack

Thermal Challenges

High ambient temperature

Cooling HDDs

Fan locations were fixed, and most of the flow takes the path of least resistance

Providing a solution at high altitude conditions

Meeting the strict compliances and standards of the product
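High altitude makes air cooling harder because air density falls with pressure, so the same volumetric flow carries less heat. The sketch below estimates the effect using the standard-atmosphere pressure relation for the troposphere. The 3000 m altitude and the 15 K sea-level temperature rise are hypothetical, and the model assumes unchanged power, volumetric flow and air inlet temperature.

```python
def pressure_ratio_at_altitude(altitude_m: float) -> float:
    """Standard-atmosphere pressure relative to sea level (troposphere, below about 11 km)."""
    return (1.0 - 2.25577e-5 * altitude_m) ** 5.25588


def air_temp_rise_at_altitude(sea_level_rise_k: float, altitude_m: float) -> float:
    """Air temperature rise at altitude for the same power and volumetric flow.

    At equal air temperature, density scales with pressure (ideal gas), and the
    temperature rise scales inversely with density.
    """
    density_ratio = pressure_ratio_at_altitude(altitude_m)
    return sea_level_rise_k / density_ratio


ALTITUDE_M = 3000.0  # hypothetical deployment altitude

density_ratio = pressure_ratio_at_altitude(ALTITUDE_M)
rise_at_altitude_k = air_temp_rise_at_altitude(15.0, ALTITUDE_M)

print(f"Air density relative to sea level: {density_ratio:.2f}")
print(f"Air temperature rise: 15.0 K at sea level, {rise_at_altitude_k:.1f} K at altitude")
```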

Smart Cooling Solutions

Flow deflectors/ducts were used to utilize the available flow across the unit

Four dedicated flow channels were designed to control the flow behavior inside the system:

• HDD air flow channel
- HDDs were cooled with a flow duct, which directs the flow from the fan to the HDDs without any preheating of the air

• Network switch board and PDB flow channel
- Custom made heat sinks were designed to cool the components on these two boards

• CPU flow channel
- The processor was cooled with a custom made heat sink using the heat pipe and heat sink advanced cooling option

• IO board flow channel

Conclusion

This paper gave a brief description of the evolution of servers and their power consumption. As servers are very important in data centers, thermal management of servers in data centers and green data centers was also highlighted, with descriptions of the innovative cooling technologies which have evolved in the industry. Present thermal challenges faced by servers and the innovative solutions which emerged from the industry were discussed. Two case studies (a high end server and a telecommunication server) illustrating HCL’s capability in the complete thermal management of servers were presented. These case studies exemplify the application of advanced technologies from the electronics cooling industry to achieve optimal, workable thermal solutions.

References

1. http://en.wikipedia.org/wiki/Server_(computing)

2. Tom Vanderbilt, ‘Data Center Overload’, (http://www.nytimes.com/2009/06/14/magazine/14search-t.html?_r=1)

3. Jed Scaramella, ‘Enabling Technologies for Power and Cooling’, Sept 2006, IDC

4. ‘Data Center Cooling Strategies’, Technology Brief, www.hp.com

5. Qpedia, April 2009, Vol. III, Issue. II, Advanced Thermal Solutions, Inc., www.qats.com/qpedia

6. Qpedia, July 2009, Vol. III, Issue. VI, Advanced Thermal Solutions, Inc., www.qats.com/qpedia

7. Hossam Metwally, ‘Methods for Evaluating Advanced Electronics Cooling Systems’, Fluent Inc, WP-103

8. ‘Power and Cooling in the Data Center’, www.amd.com

9. Christian L. Belady, P.E, ‘In the data center, power and cooling costs more than the IT equipment it supports’ (http://electronics-cooling.com/articles/2007/feb/a3/)

10. Lisa Stapleton, ‘Getting smart about data center cooling’, November 2006, (http://www.hpl.hp.com/news/2006/oct-dec/power.html)

11. BCC Research http://www.bccresearch.com/report/SMC024E.html

12. Uptime Institute (http://www.uptimeinstitute.org/)


Acronyms

1U One Rack Height (Equals 1.75 in or 44.45mm)

2U Two Rack Height (Equals 3.5 in or 88.9mm)

CPU Central Processing Unit

CFD Computational Fluid Dynamics

DCA Distributed Converter Assembly

DIMMs Dual In-line Memory Modules

DRAM Dynamic Random Access Memory

HDDs Hard Disk Drives

IO Card Input Output Card

PDB Power Distribution Board

PSU Power Supply Unit

TEC Thermo Electric Cooler

Authors

Jagadish Thammanna is a Manager and heads the CFD and Thermal team at HCL Technologies. He has 15 years of experience in thermal management across niche domains and various cross-application industries. His areas of interest include Computational Fluid Dynamics, heat transfer and scientific programming. Over his career, he has presented and published many national and international papers at technical symposiums.

Benarji Nalamala received his MS degree, specializing in Heat Transfer, from the Indian Institute of Technology Madras in 2006. He is a Thermal Analyst at HCL Technologies Ltd. He has over 4 years of experience in thermal design and management of electronic equipment in various domains. He has provided novel cooling solutions for various electronic devices with emerging cooling technologies from the industry. His areas of interest include Computational Fluid Dynamics and heat transfer.


About HCL

About HCL Enterprise

HCL is a $5 billion leading global Technology and IT Enterprise that comprises two companies listed in India - HCL Technologies & HCL Infosystems. Founded in 1976, HCL is one of India’s original IT garage start-ups, a pioneer of modern computing, and a global transformational enterprise today. Its range of offerings spans Product Engineering, Custom & Package Applications, BPO, IT Infrastructure Services, IT Hardware, Systems Integration, and distribution of ICT products across a wide range of focused industry verticals. The HCL team comprises over 62,000 professionals of diverse nationalities, who operate from 26 countries including over 500 points of presence in India. HCL has global partnerships with several leading Fortune 1000 firms, including leading IT and Technology firms. For more information, please visit www.hcl.in

HCL Technologies

HCL Technologies is a leading global IT services company, working with clients in the areas that impact and redefine the core of their businesses. Since its inception into the global landscape after its IPO in 1999, HCL has focused on ‘transformational outsourcing’, underlined by innovation and value creation, and offers an integrated portfolio of services including software-led IT solutions, remote infrastructure management, engineering and R&D services and BPO. HCL leverages its extensive global offshore infrastructure and network of offices in 26 countries to provide holistic, multi-service delivery in key industry verticals including Financial Services, Manufacturing, Consumer Services, Public Services and Healthcare. HCL takes pride in its philosophy of ‘Employee First’, which empowers its 54,443 transformers to create real value for customers. HCL Technologies, along with its subsidiaries, had consolidated revenues of US$ 2.3 billion (Rs. 11,270 crores), as on 30th September 2009. For more information, please visit www.hcltech.com