Page 1: Keeping Hot Chips Cool

Keeping Hot Chips Cool

Thermal Management for Green Computing

Yang Ge
Professor Qinru Qiu

Page 2: Keeping Hot Chips Cool

Outline
• Background
– Need for green computing
– Adverse effects of high temperature
– Thermal management techniques
• Ongoing project
– Power and thermal management for single chip cloud computer (SCC)

Page 3: Keeping Hot Chips Cool

The need for green computing
• Computers consume 3% of US energy use
– Saving 1% of data center energy amounts to more than the output of a power plant
• Each computer generates 1 ton of CO2 every year
– Equivalent to the CO2 emission of a car driving a round trip between New York and Los Angeles

Page 4: Keeping Hot Chips Cool

Power and Cost for Cooling Systems
• The energy dissipation of the cooling system is high
– Cooling fan power can reach up to 51% of the overall server power budget
• Cooling is expensive in large data centers
– The total cooling costs for large data centers can run into tens of millions of dollars

[Pie chart: IBM P670 server power breakdown. Fans 51%, CPU 24%, Mem 20%, Other 6%]

Page 5: Keeping Hot Chips Cool

Adverse effects of high temperature on VLSI chips

• Affects system reliability and causes permanent device failure

• Doubles leakage power consumption for every 9 °C increase

• Requires higher fan speed, which can shorten fan lifetime
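
The leakage bullet above can be read as an exponential rule: leakage power doubles for each 9 °C rise. A minimal sketch of that rule follows, assuming a reference leakage value at a reference temperature; the function name and the numbers (10 W at 50 °C) are hypothetical and only illustrate the arithmetic.

```python
def leakage_power(p_ref_watts, temp_c, ref_temp_c, doubling_step_c=9.0):
    """Leakage power at temp_c, assuming it doubles every doubling_step_c
    degrees above ref_temp_c (the 9 °C figure comes from the slide)."""
    return p_ref_watts * 2.0 ** ((temp_c - ref_temp_c) / doubling_step_c)


# Hypothetical example: 10 W of leakage at 50 °C.
for temp in (50, 59, 68, 77):
    print(f"{temp} °C -> {leakage_power(10.0, temp, 50.0):.1f} W")
```

Under this rule a chip that runs 27 °C hotter leaks eight times as much, which in turn adds heat; this feedback is one reason temperature and power have to be managed together.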

Page 6: Keeping Hot Chips Cool

Thermal Management Techniques

Offline Techniques

Online Techniques

Temperature aware scheduling

Dynamic voltage frequency scaling

Temperature aware task migration
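
As a rough illustration of why dynamic voltage frequency scaling helps, the sketch below uses the standard CMOS dynamic power model, P = C_eff × V² × f. The model is not stated on the slide, and the function names and operating points are assumptions chosen for illustration.

```python
def dynamic_power(c_eff_farads, voltage, freq_hz):
    """Standard CMOS dynamic power model: P = C_eff * V^2 * f."""
    return c_eff_farads * voltage ** 2 * freq_hz


def dvfs_saving(v_old, f_old, v_new, f_new):
    """Fraction of dynamic power saved when moving from (v_old, f_old)
    to (v_new, f_new). The effective capacitance cancels out."""
    return 1.0 - (v_new ** 2 * f_new) / (v_old ** 2 * f_old)


# Hypothetical operating points: halve the frequency and lower the voltage.
saving = dvfs_saving(v_old=1.1, f_old=800e6, v_new=0.8, f_new=400e6)
print(f"dynamic power saved: {saving:.0%}")
```

Because voltage enters quadratically, lowering voltage together with frequency saves considerably more power than lowering frequency alone, at the cost of longer execution time.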

Page 7: Keeping Hot Chips Cool

Ongoing Project
• Power and thermal management for single chip cloud computer (SCC)

Page 8: Keeping Hot Chips Cool

Overview of SCC Architecture

• 24 tiles arranged in a 6×4 array

• 2 CPUs on each tile

• A router associated with each tile

• 4 memory controllers connect to on-board memory
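
The layout described on this slide (24 tiles in a 6×4 array, 2 CPUs per tile, one router per tile) implies 48 cores in total. The sketch below models that layout. The core numbering scheme (consecutive core IDs share a tile, tiles numbered row by row) and the Manhattan hop count between routers are assumptions for illustration, not details taken from the slides.

```python
TILE_COLS = 6        # tiles per row (from the slide: 6x4 array)
TILE_ROWS = 4        # rows of tiles
CORES_PER_TILE = 2   # from the slide: 2 CPUs on each tile
NUM_TILES = TILE_COLS * TILE_ROWS
NUM_CORES = NUM_TILES * CORES_PER_TILE


def core_to_tile(core_id):
    """Map a core ID to its tile's (x, y) position.
    Assumed numbering: cores 2k and 2k+1 share tile k, tiles row by row."""
    if not 0 <= core_id < NUM_CORES:
        raise ValueError(f"core_id must be in 0..{NUM_CORES - 1}")
    tile = core_id // CORES_PER_TILE
    return tile % TILE_COLS, tile // TILE_COLS


def hop_distance(core_a, core_b):
    """Router hops between two cores, assuming a 2D mesh (Manhattan distance)."""
    ax, ay = core_to_tile(core_a)
    bx, by = core_to_tile(core_b)
    return abs(ax - bx) + abs(ay - by)


print(NUM_CORES, core_to_tile(47), hop_distance(0, 47))
```

A mapping like this is useful for thermal management because neighbouring tiles heat each other, so a scheduler needs to know which cores are physically adjacent.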

Page 9: Keeping Hot Chips Cool

Management Console PC (MCPC)

• The SCC and the MCPC communicate over a PCIe bus

• The MCPC runs Ubuntu 10.04 x64 and software from Intel

• Load a Linux image on each core

• Read and modify SCC registers

• Load programs on the SCC cores

Page 10: Keeping Hot Chips Cool

Power and Thermal Management

• 6 voltage domains

• 24 frequency domains, one for each tile

• 2 temperature sensors on each tile

• Voltage and frequency can be changed separately on each domain
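
With one frequency domain and two temperature sensors per tile, a simple per-tile throttling policy becomes possible. The sketch below is one hypothetical policy, not the project's actual controller: it takes the hotter of the two sensor readings and steps a frequency divider up or down with hysteresis. The thresholds, the divider range, and the assumption that sensor readings are already converted to °C are all illustrative.

```python
def tile_temperature(sensor_a_c, sensor_b_c):
    """Use the hotter of a tile's two sensors as the tile temperature."""
    return max(sensor_a_c, sensor_b_c)


def next_divider(divider, temp_c, hot_c=80.0, cool_c=65.0,
                 min_div=2, max_div=16):
    """Return the next frequency divider for one tile.
    A larger divider means a lower clock frequency (assumed convention).
    Between cool_c and hot_c the divider is left unchanged (hysteresis)."""
    if temp_c >= hot_c:
        return min(divider + 1, max_div)   # too hot: slow the tile down
    if temp_c <= cool_c:
        return max(divider - 1, min_div)   # cool enough: speed back up
    return divider


# Hypothetical sensor readings (sensor A, sensor B) for three tiles.
readings = {0: (62.0, 64.5), 1: (70.0, 72.0), 2: (79.0, 83.5)}
dividers = {tile: 4 for tile in readings}
for tile, (a, b) in readings.items():
    dividers[tile] = next_divider(dividers[tile], tile_temperature(a, b))
print(dividers)
```

Since there are fewer voltage domains (6) than frequency domains (24), a full controller would also have to pick each domain's voltage to suit all the tiles that share it; that step is omitted here.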

Page 11: Keeping Hot Chips Cool

Thank you