Green Computing for a Clean Tomorrow Improve efficiency, reliability, availability, and usability of...



Green Computing for a Clean Tomorrow

• Improve efficiency, reliability, availability, and usability of computing systems.
  – Sacrifice a bit of raw speed to reduce power & energy consumption.
  – Improve overall throughput as the system will always be available, i.e., effectively no downtime.
• Reduce total cost of ownership & increase return on investment.

Crude Analogy

• Formula One Race Car: Wins on raw performance, but reliability is so poor that it requires frequent maintenance. Throughput is low.
• Honda S2000: Loses on raw performance, but high reliability results in high throughput (i.e., miles driven/month, analogous to answers/month).

1. Power Consumption & Heat Generation Hurt Reliability, Availability, & Total Cost of Ownership

2. Electrical Power for Computing Costs $$$
• Earth Simulator: 12 MW of power costs $10M/year
• World’s Processors: 1.3 GW of power costs $1B/year
• “Hiding in Plain Sight, Google Seeks More Power,” The New York Times, June 14, 2006.

3. Computing “Contributes” to Global Warming
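The yearly cost figures in item 2 can be sanity-checked with a little arithmetic. The sketch below assumes a constant power draw all year and an electricity price of $0.10 per kWh; the price is an assumption for illustration and is not stated on the poster.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_energy_cost(power_watts, dollars_per_kwh=0.10):
    """Cost of drawing a constant power for one year.

    dollars_per_kwh is an assumed price, not a figure from the poster.
    """
    kwh_per_year = power_watts / 1000.0 * HOURS_PER_YEAR
    return kwh_per_year * dollars_per_kwh


earth_simulator = annual_energy_cost(12e6)     # 12 MW
worlds_processors = annual_energy_cost(1.3e9)  # 1.3 GW

print(f"Earth Simulator: ${earth_simulator / 1e6:.1f}M/year")
print(f"World's processors: ${worlds_processors / 1e9:.2f}B/year")
```

Under that assumed price, both results land close to the $10M/year and $1B/year quoted above.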

Motivation

Goal: Deliver high performance while reducing power & energy consumption and improving reliability.

Prof. Wu FENG, [email protected]
College of Engineering, Depts. of CS & ECE

Just the processors (i.e., CPUs) in PCs ≈ 40 Hoover Dams
(Estimated power consumption of PCs ≈ 120 Hoover Dams)

2001 to 2006

[Figure: Chip maximum power in watts/cm² (log scale, 1 to 1000) plotted against process feature size (axis ticks 1.5, 1, 0.7, 0.5, 0.35, 0.25, 0.18, 0.13, 0.1, 0.07) and year (1985, 1995, 2001). Data points: i386, 1 watt; i486, 2 watts; Pentium, 14 watts; Pentium Pro, 30 watts; Pentium II, 35 watts; Pentium III, 35 watts; Pentium 4, 75 watts; Itanium, 130 watts. Annotations: "Surpassed heating plate" and "Not too long to reach nuclear reactor".]

Source: Fred Pollack, Intel. New Microprocessor Challenges in the Coming Generations of CMOS Technologies, MICRO32 and Transmeta

Systems | Processors | Reliability & Availability
ASCI Q | 8,192 | MTBI: 6.5 hrs. 114 unplanned outages/month. Outage sources: storage, CPU, memory.
ASCI White | 8,192 | MTBF: 5 hrs. (2001) and 40 hrs. (2003). Outage sources: storage, CPU, 3rd-party HW.
NERSC Seaborg | 6,656 | MTBI: 14 days. MTTR: 3.3 hrs. SW is the main outage source. Availability: 98.74%.
PSC Lemieux | 3,016 | MTBI: 9.7 hrs. Availability: 98.33%.
Google | ~450,000 | 550+ reboots/day; 2-3% machines replaced/yr. Outage sources: storage, memory. Availability: ~100%.

MTBI: mean time between interrupts; MTBF: mean time between failures; MTTR: mean time to restore
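To relate the reliability metrics to each other, here is a minimal sketch that converts an MTBI into expected interrupts per month and applies the textbook steady-state availability ratio, uptime / (uptime + restore time). Treating MTBI as the uptime between interrupts and using a 30-day month are simplifying assumptions, not definitions taken from the poster.

```python
def interrupts_per_month(mtbi_hours, days_in_month=30):
    """Expected number of interrupts in a month for a given MTBI."""
    return days_in_month * 24 / mtbi_hours


def steady_state_availability(mtbi_hours, mttr_hours):
    """Textbook ratio: time up divided by time up plus time to restore."""
    return mtbi_hours / (mtbi_hours + mttr_hours)


asci_q = interrupts_per_month(6.5)                  # MTBI of 6.5 hrs
seaborg = steady_state_availability(14 * 24, 3.3)   # MTBI 14 days, MTTR 3.3 hrs

print(f"ASCI Q: about {asci_q:.0f} interrupts/month")
print(f"NERSC Seaborg: about {seaborg:.2%} available")
```

For ASCI Q this gives roughly 111 interrupts per month, close to the 114 unplanned outages listed. For Seaborg the simple ratio gives about 99.0%, somewhat higher than the listed 98.74%, so the listed figure presumably reflects more than this ratio alone.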

• “Making a Case for a Green500 List,” 20th IEEE Int’l Parallel & Distributed Processing Symp., Apr. 2006.
• “A Power-Aware Run-Time System for High-Performance Computing,” SC|05, Nov. 2005.
• “The Importance of Being Low Power in High-Performance Computing,” CTWatch Quarterly (NSF), 1(3):12-20, Aug. 2005.
• “Green Destiny and Its Evolving Parts,” Innovative Supercomputer Architecture Award, 19th Int’l Supercomputer Conf., Jun. 2004.
• “Green Destiny + mpiBLAST = Bioinfomagic,” 10th Int’l Conf. on Parallel Computing (ParCo), Sept. 2003.
• “Honey, I Shrunk the Beowulf!” 31st Int’l Conf. on Parallel Processing, Aug. 2002.

Self-Adapting Software for Energy Efficiency

Conserve power & energy WHILE programs run.

[Figure: Results for Sequential Codes and Parallel Codes, reported as relative time / relative energy with respect to total execution time and system energy usage.]

Energy savings and performance improvement!

1. Low-Power, High-Performance Computing

Green Destiny: A 240-Node Supercomputer in 5 Sq. Ft. with a 3.2-kW Power Envelope

Reliability
• Operating Environment: A dusty 85°-90° F warehouse. No machine room.
• No unscheduled downtime in its 24-month lifetime.

Performance
• A Top500 Supercomputer (circa March 2002).
• Linpack: 101 Gflops.
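One way to combine the two headline numbers is performance per watt. The sketch below divides the 101-Gflops Linpack result by the 3.2-kW power envelope; pairing them this way is an assumption, since the envelope is a budget rather than a measured draw during the Linpack run.

```python
def mflops_per_watt(gflops, kilowatts):
    """Performance per unit power, in Mflops per watt."""
    flops = gflops * 1e9
    watts = kilowatts * 1e3
    return flops / watts / 1e6


# Assumed pairing: Linpack result over the stated power envelope.
green_destiny = mflops_per_watt(gflops=101, kilowatts=3.2)
print(f"Green Destiny: about {green_destiny:.1f} Mflops/watt")
```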


Arrhenius Equation (applied to microelectronics) & Twenty Years of Empirical Data

For every 10°C increase in temperature, the failure rate of the system doubles.

Reduce power consumption → Reduce system temperature → Reduce failure rate
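The doubling rule above can be written as a one-line function. This is a sketch of the rule of thumb only (a factor of 2 per 10°C), not of the full Arrhenius equation, and the temperature deltas in the loop are arbitrary illustration values.

```python
def failure_rate_multiplier(delta_celsius):
    """Relative failure rate after a temperature change of delta_celsius.

    Encodes the rule of thumb: every +10 degrees C doubles the failure rate.
    """
    return 2.0 ** (delta_celsius / 10.0)


for delta in (-20, -10, 0, 10, 20):
    print(f"{delta:+d} C -> {failure_rate_multiplier(delta):.2f}x failure rate")
```

Read in the cooling direction, the same rule says that running 10°C cooler halves the expected failure rate, which is the link between lower power and higher reliability.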

Observation

The Project: Supercomputing in Small Spaces

Green Destiny: A 240-node Supercomputer in Five Sq. Ft.

2. Power-Aware, High-Performance Computing

Only Difference? The Processors

Green Destiny: Low-Power Supercomputer, 3.2 kW

Green Destiny “Replica”: Traditional Supercomputer, 30.0 kW

Hypothesis

Selected Publications

Many commodity technologies support dynamic voltage & frequency scaling (DVFS), which allows changes to the processor voltage and frequency at run time.

A computing system can trade off processor performance for power reduction.
• Power ∝ V²f, where V is the supply voltage of the processor and f is its frequency.
• Processor performance ∝ frequency.
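To make the trade-off concrete, here is a small sketch of the two proportionalities: power scales with V squared times f, and performance scales with f. The 0.8 scaling factors are hypothetical, and the energy estimate assumes a fully CPU-bound program whose run time scales as 1/f.

```python
def relative_power(v_scale, f_scale):
    """Power relative to baseline when V and f are scaled (P ~ V^2 * f)."""
    return v_scale ** 2 * f_scale


def relative_energy(v_scale, f_scale):
    """Energy relative to baseline, assuming run time scales as 1 / f."""
    relative_time = 1.0 / f_scale
    return relative_power(v_scale, f_scale) * relative_time


# Hypothetical setting: voltage and frequency both lowered to 80%.
power = relative_power(0.8, 0.8)
energy = relative_energy(0.8, 0.8)
print(f"power:  {power:.3f} of baseline")
print(f"energy: {energy:.3f} of baseline, at {1 / 0.8:.2f}x the run time")
```

In this hypothetical case power drops to about half while run time grows by a quarter, which is why lowering voltage together with frequency is attractive.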

Approach: Intelligent DVFS Scheduling

Determine when to adjust the voltage-frequency setting and what to adjust it to.
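As an illustration of the "what to adjust it to" half of the problem, the sketch below picks the lowest-power setting whose predicted slowdown stays within a tolerance. It is not the scheduler described on the poster: the voltage-frequency table, the slowdown model, and the names `Setting`, `choose_setting`, and `cpu_bound_fraction` are all hypothetical. The "when" could be handled by calling `choose_setting` at each scheduling interval with a fresh estimate of how CPU-bound the program currently is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Setting:
    freq_ghz: float
    volts: float


# Hypothetical voltage-frequency table, not taken from any real processor.
SETTINGS = [
    Setting(2.0, 1.5),
    Setting(1.8, 1.4),
    Setting(1.6, 1.3),
    Setting(1.2, 1.1),
    Setting(0.8, 0.9),
]


def predicted_slowdown(setting, f_max, cpu_bound_fraction):
    """Assumed model: only the CPU-bound share of run time stretches as 1/f."""
    return 1.0 + cpu_bound_fraction * (f_max / setting.freq_ghz - 1.0)


def relative_power(setting, top):
    """Power relative to the fastest setting, using P ~ V^2 * f."""
    return (setting.volts / top.volts) ** 2 * (setting.freq_ghz / top.freq_ghz)


def choose_setting(cpu_bound_fraction, max_slowdown=1.05, settings=SETTINGS):
    """Return the lowest-power setting within the slowdown tolerance."""
    top = max(settings, key=lambda s: s.freq_ghz)
    allowed = [
        s for s in settings
        if predicted_slowdown(s, top.freq_ghz, cpu_bound_fraction) <= max_slowdown
    ]
    return min(allowed, key=lambda s: relative_power(s, top))


for fraction in (1.0, 0.1, 0.0):
    print(fraction, choose_setting(fraction))
```

A fully CPU-bound phase keeps the fastest setting, while phases dominated by memory or communication can drop to lower settings at little predicted cost in run time.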

Approaches

Results

Laboratory

NAS/NPB 3.2 – MPI, C.16

Featured in The New York Times, CNN, and BBC News
Now in The Computer History Museum