Comparing Data Center & Computer Thermal Design

By Michael K. Patterson, Ph.D., P.E., Member ASHRAE; Robin Steinbrecher; and Steve Montgomery, Ph.D.
The design of cooling systems and thermal solutions for today's data centers and computers is handled by skilled mechanical engineers using advanced tools and methods. These engineers work in two different areas: those who are responsible for designing cooling for computers and servers, and those who design data center cooling. Unfortunately, a lack of understanding exists about each other's methods and design goals. This can lead to non-optimal designs and problems in creating a successful, reliable, energy-efficient data processing environment.

This article works to bridge this gap and provide insight into the parameters each engineer works with and the optimizations they go through. A basic understanding of each role will help the counterpart engineer in their own design, be it a data center or a server.
Server Design Focus
Thermal architects are given a range
of information to begin designing the
thermal solution. They know the thermal
design power (TDP) and temperature
specifications of each component (typi-
cally junction temperature, TJ, or case
temperature, TC). Using a processor as
an example, Figure 1 shows a typical
component assembly.
The processor is specified with a maximum case temperature, TC, which is used for design purposes. In this example, the design parameters are TDP = 103 W and TC = 72°C. Given an ambient temperature specification TA = 35°C, the required thermal resistance of this example would need to be equal to or lower than:

ΨCA,required = (TC − TA)/TDP = 0.36°C/W     (1)
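The arithmetic behind Equation 1 is simple enough to script. The following is a minimal sketch in Python, using only the example values given above, that evaluates the required case-to-ambient thermal resistance:

```python
def required_psi_ca(tdp_w: float, t_case_c: float, t_ambient_c: float) -> float:
    """Required case-to-ambient thermal resistance (degC/W), per Equation 1."""
    return (t_case_c - t_ambient_c) / tdp_w

# Example values from the article: TDP = 103 W, TC = 72 degC, TA = 35 degC
print(round(required_psi_ca(103, 72, 35), 3))  # 0.359 degC/W, so the solution must achieve <= ~0.36 degC/W
```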
Sometimes this value of ΨCA is not feasible. One option to relieve the demands of a thermal solution with a lower thermal resistance is a higher TC. Unfortunately, the trend for TC continues to decline. Reductions in TC result in higher performance, better reliability, and less power used. Those advantages are worth obtaining, making the thermal challenge greater.

One of the first parameters discussed by the data center designer is the temperature rise across the servers, but this value is a secondary consideration, at best, in the server design. As seen in Equation 1, no consideration is given to chassis temperature rise. The thermal design is driven by maintaining component temperatures within specifications; the primary parameters are TC, Tambient, and ΨCA,actual. The actual thermal resistance of the solution is driven by component selection, material, configuration, and airflow volumes.
Usually, the only time that chassis TRISE
Michael K. Patterson, Ph.D., P.E., is thermal research engineer, platform initiatives and pathfinding, at Intel's Digital Enterprise Group in Hillsboro, Ore. Robin Steinbrecher is staff thermal architect with Intel's Server Products Group in DuPont, Wash. Steve Montgomery, Ph.D., is senior thermal architect at Intel's Power and Thermal Technologies Lab, Digital Enterprise Group, DuPont, Wash.
Monitoring of temperature sensors is accomplished via on-die thermal diodes or discrete thermal sensors mounted on the printed circuit boards (PCBs). Component utilization monitoring is accomplished through activity measurement (e.g., memory throughput measurement by the chipset) or power measurement of individual voltage regulators. Either of these methods results in calculation of component or subsystem power.
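As an illustration only (the article does not spell out the arithmetic, and the 1.3 V / 60 A rail and 85% efficiency below are hypothetical), component power can be estimated from a voltage regulator's output telemetry:

```python
def component_power_w(vout_v: float, iout_a: float) -> float:
    """Power delivered to the component (W) from measured regulator output voltage and current."""
    return vout_v * iout_a

def regulator_input_power_w(vout_v: float, iout_a: float, efficiency: float) -> float:
    """Power drawn at the regulator input (W), accounting for conversion losses."""
    return vout_v * iout_a / efficiency

# Hypothetical CPU rail: 1.3 V at 60 A delivered, regulator ~85% efficient
print(component_power_w(1.3, 60.0))              # 78 W dissipated in the component
print(regulator_input_power_w(1.3, 60.0, 0.85))  # ~92 W drawn from the supply
```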
Data Center Design Focus
The data center designer faces a similar list of criteria for
the design of the center, starting with a set of requirements that
drive the design. These include:
Cost: The owner will have a set budget, and the designer must create a system within the cost limits. Capital dollars are the primary metric. However, good designs also consider the operational cost of running the system needed to cool the data center. Combined, these comprise the total cost of ownership (TCO) for the cooling systems.

Equipment list: The most detailed information would include a list of equipment in the space and how it will be racked together. This allows for a determination of the total cooling load in the space, and of the airflow volume and distribution in the space.
Caution must be taken if the equipment list is used to develop the cooling load by summing up the total connected load. This leads to over-design. The connected load, or maximum rating of the power supply, is always greater than the maximum heat dissipation possible by the sum of the components. Obtaining the thermal load generated by the equipment from the supplier is the only accurate way of determining the cooling requirements.

Unfortunately, the equipment list is not always available, and the designer will be given only a cooling load per unit area and will need to design the systems based upon this information. Sizing the cooling plant is straightforward when the total load is known, but the design of the air-handling system is not as simple.
Performance: The owner will define the ultimate performance of the space, generally given in terms of ambient temperature and relative humidity. Beaty and Davidson² discuss typical values of the space conditions and how these relate to classes of data centers. Performance also includes values for airflow distribution, total cooling, and percent outdoor air.
Reliability: The cooling system's reliability level is defined and factored into equipment selection and the layout of distribution systems. The reliability of the data center cooling system requires an economic evaluation comparing the cost of that reliability vs. the cost of the potential interruptions to center operations. The servers protect themselves in the event of cooling failure; the reliability of the cooling system should not be justified based upon equipment protection.
Data Center Background
Experience in data center layout and configuration is helpful in understanding the design issues. Consider two cases at the limits of data center arrangement and cooling configuration:
1. A single rack in a room, and
2. A fully populated room, with racks side by side in mul-
tiple rows.
Case 2 assumes a hot-aisle/cold-aisle rack configuration, where the cold aisle is the server airflow inlet side containing the perforated tiles, and the hot aisle receives the back-to-back server outlets, discharging the warm air into the room. Hot aisle/cold aisle is the most prevalent configuration, as the arrangement prevents mixing of the inlet cooling air and the warm return air. The most common airflow configuration of individual servers is front-to-back, which works directly with the hot-aisle/cold-aisle concept, but it is not the only configuration.
Consider the rack of servers in a data processing environment. Typically, these racks are 42U high, where 1U = 44.5 mm (1.75 in.). A U is a commonly used unit to define the height of electronics gear that can be rack mounted. The subject rack could hold 42 1U servers, or 10 4U servers, or other combinations of equipment, including power supplies, network hardware, and/or storage equipment.

To consider the two limits, first take the described rack and place it by itself in a reasonably sized space with some cooling in place. The other limit occurs when this rack of equipment is placed in a data center where the rack is one of many similar racks in an aisle. The data center would have multiple aisles, generally configured front-to-front and back-to-back.
Common Misconceptions
A review of misconceptions illustrates the problems and challenges facing designers of data centers. During a recent design review of a data center cooling system, one of the engineers claimed that the servers were designed for a 20°C (36°F) TRISE, inlet to outlet air temperature. This is not the case. It is possible that there are servers that, when driven at a given airflow and dissipating their nominal amount of power, may generate a 20°C (36°F) ΔT, but none were ever designed with that in mind.
Recall the parameters that were discussed in the section on server design. Reducing ΨCA can be accomplished by increasing airflow. However, this also has a negative effect. More powerful air movers increase cost, use more space, are louder, and consume more energy. Increasing airflow beyond the minimum required is not a desirable tactic. In fact, reducing the airflow as much as possible would be of benefit in the overall server design. However, nowhere in that optimization problem is the ΔT across the server considered.
Assuming a simple TRISE leads to another set of problems. This implies a fixed airflow rate. As discussed earlier, most servers monitor temperature at different locations in the system and modulate airflow to keep the components within desired temperature limits. For example, a server in a well-designed data center, particularly if located low in the rack, will likely see a TA of 20°C (68°F) or less. However, the thermal solution in the server is normally designed to handle a TA of 35°C (95°F). If the inlet temperature is at the lower value, the case temperature will be lower. Then, much less airflow is required, and if variable flow capability is built into the server, it will run quieter and consume less power. The server airflow
(and hence TRISE) will vary between the TA = 20°C (68°F) and 35°C (95°F) cases, a variation described in ASHRAE's Thermal Guidelines for Data Processing Environments. The publication provides a detailed discussion of what data should be reported by the server manufacturer and in which configuration.
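To make the point concrete, the sketch below (Python; the 300 W server and the air properties are assumed for illustration, not taken from the article) evaluates the steady-state sensible heat balance. The temperature rise follows directly from heat load and airflow, so a server that modulates its fans does not have a single design TRISE:

```python
RHO_AIR = 1.2    # kg/m^3, assumed air density at typical data center conditions
CP_AIR = 1005.0  # J/(kg*K), specific heat of air

def t_rise_c(heat_w: float, airflow_m3_s: float) -> float:
    """Air temperature rise (degC) across a server for a given heat load and volumetric airflow."""
    return heat_w / (RHO_AIR * airflow_m3_s * CP_AIR)

# Hypothetical 300 W server: the rise depends entirely on how much air the fans happen to move
for cfm in (25, 42, 60):
    m3_s = cfm * 0.000472  # convert cfm to m^3/s
    print(f"{cfm} cfm -> TRISE ~ {t_rise_c(300, m3_s):.1f} degC")
```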
Another misconception is that the air in the server exhaust must be maintained below the server's ambient environmental specification. The outlet temperature of the server does not need to be below the allowed value for the environment (typically 35°C [95°F]).
Design Decisions
To understand the problems that can arise if the server design process is not fully understood, revisit the two cases introduced earlier. Consider the fully loaded rack in a space with no other equipment. If sufficient cooling is available in the room, the server thermal requirements likely will be satisfied. The servers will pull the required amount of air to cool them, primarily from the raised floor distribution but, if needed, from the sides and above the server as well. It is reasonable to assume the room is well mixed by the server and room distribution airflow. There likely will be some variation of inlet temperature from the bottom of the rack to the top, but if sufficient space exists around the servers, it is most likely not a concern. In this situation, not having the detailed server thermal report, as described in Reference 3, may not be problematic.

At the other limit, a rack is placed in a space that is fully populated with other server racks in a row. Another row sits across the cold aisle facing this row, as well as another sitting back-to-back on the hot-aisle side. The space covered by the single rack unit and its associated cold-aisle and hot-aisle floor space often is called a work cell (Figure 2) and generally covers a 1.5 m² (16 ft²) area: the 0.6 m × 0.6 m (2 ft × 2 ft) perforated tile in the front, the area covered by the rack (~0.6 m × 1.3 m [~2 ft × 4.25 ft]), and the remaining uncovered solid floor tile on the hot-aisle side.

Figure 2: The work cell is shown in orange.
Consider the airflow in and around the work cell. Each work
cell needs to be able to exist as a stand-alone thermal zone.
The airflow provided to the zone comes from the perforated
tile, travels through the servers, and exhausts out the top-back
of the work cell where the hot aisle returns the warm air to
the inlet of the room air handlers. The work cell cannot bring
air into the front of the servers from the side as this would be
removing air from another work cell and starving that zone. No air should come in from the top either, as that air will be at a temperature well above the desired ambient and possibly above the specification value for TA (typically 35°C [95°F]). Based on this concept of the work cell, it is clear that designers must know the airflow through the servers, or else they will not be able to adequately size the flow rate per floor tile. Conversely, if the tile airflow is not adequate, the server airflow will recirculate, causing problems for servers being fed the warmer air.
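A rough sketch of that sizing check (Python; every number below is hypothetical, chosen only to illustrate the balance) compares the summed server airflow demand of a rack with what its perforated tile can deliver; any shortfall is made up by recirculated hot-aisle air:

```python
# Hypothetical rack: per-server nominal airflow requirements (cfm); all values illustrative
server_airflows_cfm = [42] * 20      # e.g., twenty 1U servers at ~42 cfm each
rack_demand_cfm = sum(server_airflows_cfm)

tile_supply_cfm = 500                # assumed delivery of one perforated tile at design conditions
shortfall_cfm = max(0, rack_demand_cfm - tile_supply_cfm)

print(f"Rack demand: {rack_demand_cfm} cfm, tile supply: {tile_supply_cfm} cfm, "
      f"shortfall (made up by recirculation from the hot aisle): {shortfall_cfm} cfm")
```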
If the design basis of the data center includes the airflow rates of the servers, certain design decisions are needed. First, the design must provide enough total cooling capacity for the peak, matching the central plant to the load.

Another question is at what temperature to deliver the supply air. Lowering this temperature can reduce the required fan size in the room cooling unit, but it also can be problematic, as the system, particularly in a high-density data center, must provide the minimum (or nominal) airflow to all of the work cells. A variant of this strategy is increasing the ΔT. Doing this allows a lower airflow rate to give the same total cooling capability. This will yield lower capital costs, but if the airflow rate is too low, increasing the ΔT will cause recirculation. Also, if the temperature is too low, comfort and ergonomic issues could arise.
If the supplier has provided the right data, another decision must be made. Should the system provide enough for the peak airflow, or just the typical? The peak airflow rate will occur when TA = 35°C (95°F) and the typical when TA = 20°C to 25°C (68°F to 77°F). Sizing the air-distribution equipment at the peak flow will result in a robust design with flexibility, but at a high cost. Another complication in sizing for the peak flow, particularly in dense data centers, is that it may prove difficult to move this airflow through the raised floor tiles, causing an imbalance or increased leakage elsewhere. Care must be taken to ensure the raised floor is of sufficient height and an appropriate design for the higher airflows.
If the nominal airflow rate is used as the design point, the
design, installation, and operation (including floor tile selection
for balancing the distribution) must be correct for the proper
operation of the data center, but a cost savings potential exists.
It is essential to perform some level of modeling to determine
the right airflow. In this design, any time the servers ramp up
to their peak airflow rate, the racks will be recirculating warm
air from the hot aisle to feed some server inlets.
This occurs because the work cell has to satisfy its own airflow
needs (because its neighbors are also short of airflow) and, if
the servers need more air, they will receive it by recirculat-
ing. Another way to visualize this is to consider the walls of
symmetry around each work cell and recall that there is no
flux across a symmetry boundary. The servers are designed to
operate successfully at 35°C (95°F) inlet air temperatures, so if
the prevalence of this recirculation is not too great, the design
should be successful.
If the detailed equipment list is unknown when the data center
is being designed, the airflow may be chosen based on historical
airflows for similarly loaded racks in data centers of the same
load and use patterns. It is important to ensure the owner is aware of the airflow assumptions made and any limits that the assumptions would place on equipment selection, particularly in light of the trend towards higher power density equipment. The airflow balancing and verification would then fall to a commissioning agent or the actual space owner. In either case, the airflow assumptions need to be made clear during the computer equipment installation and floor tile setup.

Figure 3: Rack recirculation problem (full data center; temperature, °C).
Discussions with a leading facility engineering company in Europe⁵ provide insight into an alternate design methodology when the equipment list is not available. A German engineering society standard on data center design⁶ requires a fixed value of 28°C at 1.8 m (82°F at 6 ft) above the raised floor. This includes the hot aisle and ensures that, if sufficient airflow is provided to the room, all servers will be maintained below the upper temperature limits even if recirculation occurs.
Using this approach, it is reasonable to calculate the total airflow in a new design by assuming an inlet temperature of 20°C (68°F) (the low end of Thermal Guidelines), a discharge temperature of 35°C (95°F) (the maximum inlet temperature that should be fed to a server through recirculation), and the total cooling load of the room. A detailed design of the distribution still is required to ensure adequate airflow at all server cold aisles.
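A minimal sketch of that calculation (Python; the 500 kW room load and the air properties are assumed for illustration, not taken from the article) applies the sensible heat balance to the 20°C-to-35°C split. The same relation underlies the historical cfm/kW rules of thumb mentioned later:

```python
RHO_AIR = 1.2    # kg/m^3, assumed air density
CP_AIR = 1005.0  # J/(kg*K), specific heat of air

def total_airflow_m3_s(cooling_load_kw: float, supply_c: float = 20.0, return_c: float = 35.0) -> float:
    """Total room airflow (m^3/s) needed to carry the cooling load at the assumed temperature split."""
    return cooling_load_kw * 1000.0 / (RHO_AIR * CP_AIR * (return_c - supply_c))

# Hypothetical 500 kW room with the 20 degC supply / 35 degC discharge assumption
load_kw = 500.0
flow = total_airflow_m3_s(load_kw)
cfm = flow / 0.000472  # convert m^3/s to cfm
print(f"{flow:.1f} m^3/s  (~{cfm:.0f} cfm, ~{cfm / load_kw:.0f} cfm/kW)")
```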
The Solution
The information needed for successful design, and the means of communicating it, are well defined in Thermal Guidelines. Unfortunately, it is only now becoming part of server manufacturers' vocabulary. The data center designer needs average and peak heat loads and airflows from the equipment. The best option is to obtain the information from the supplier. While testing is possible, particularly if the owner already has a data center with similar equipment, it is not a straightforward process, as the server inlet temperatures and workload can affect the airflow rate. Thermal Guidelines provides information about airflow measurement techniques.
The methodology of the German standard also can be used,
recognizing recirculation as a potential reality of the design
and ensuring discharge temperatures are low enough to support
continued computer operation. Finally, the worst, but all-too-common, way is to use a historical value for ΔT and calculate a cfm/kW based on that historical value.
In any case, the total heat load of the room and the airflow
need to be carefully considered to ensure a successful design.
Effecting Change
Thermal Guidelines has not yet been adopted by all server manufacturers. The level of thermal information provided by the same manufacturer can even vary from product to product. During a recent specification review of several different servers, one company provided extensive airflow information, both nominal and peak, for its 1U server but gave no information on airflow for its 4U server in the same product line.
If data center operators and designers could convince their information technology sourcing managers to buy only servers that follow Thermal Guidelines (providing the needed information), the situation would rectify itself quickly. Obviously, that is not likely to happen, nor should it. On the other hand, those who own the problem of making the data center cooling work would help themselves by pointing out to the procurement decision-makers that they can have a high degree of confidence in their data center designs only for those servers that adhere to the new publication. As more customers ask for the information, more equipment suppliers will provide it.
Summary
The information discussed here is intended to assist data center designers in understanding the process by which the thermal solution in the server is developed. Conversely, the server thermal architect can benefit from an understanding of the challenges in building a high-density data center. Over time, equipment manufacturers will continue to make better use of Thermal Guidelines, which ultimately will allow more servers to be used in data centers with better use of this expensive and scarce space.
References
1. Processor Spec Finder, Intel Xeon Processors. http://processorfinder.intel.com/scripts/details.asp?sSpec=SL7PH&ProcFam=528&PkgType=ALL&SysBusSpd=ALL&CorSpd=ALL.
2. Beaty, D. and T. Davidson. 2003. "New guideline for data center cooling." ASHRAE Journal 45(12):28–34.
3. TC 9.9. 2004. Thermal Guidelines for Data Processing Environments. ASHRAE Special Publications.
4. Koplin, E.C. 2003. "Data center cooling." ASHRAE Journal 45(3):46–53.
5. Rouhana, H. 2004. Personal communication. Mechanical Engineer, M+W Zander Mission Critical Facilities, Stuttgart, Germany, November 30.
6. Verein Deutscher Ingenieure. 1994. VDI 2054, Raumlufttechnische Anlagen für Datenverarbeitung. September.