1

High-Performance, Power-Aware Computing

Vincent W. Freeh
Computer Science, NCSU
vin@csc.ncsu.edu


2

Acknowledgements

Students
  Mark E. Femal – NCSU
  Nandini Kappiah – NCSU
  Feng Pan – NCSU
  Robert Springer – Georgia

Faculty
  Vincent W. Freeh – NCSU
  David K. Lowenthal – Georgia

Sponsor
  IBM UPP Award


3

The case for power management in HPC

Power/energy consumption is a critical issue
  Energy = heat; heat dissipation is costly
  Limited power supply
  Non-trivial amount of money

Consequence
  Performance limited by available power
  Fewer nodes can operate concurrently

Opportunity: bottlenecks
  A bottleneck component limits the performance of other components
  Reduce power of some components without reducing overall performance

Today, the CPU is:
  A major power consumer (~100W)
  Rarely the bottleneck
  Scalable in power/performance (frequency & voltage)

Power/performance "gears"


4

Is CPU scaling a win?

Two reasons:
  1. Frequency and voltage scaling
     Performance reduction less than power reduction
  2. Application throughput
     Throughput reduction less than performance reduction

Assumptions
  CPU is a large power consumer
  CPU drives the other components
  Diminishing throughput gains

[Figure: (1) power vs. performance (freq); (2) application throughput vs. performance (freq)]

CPU power: P = ½CV²f


5

AMD Athlon-64

x86 ISA
64-bit technology
HyperTransport technology – fast memory bus

Performance
  Slower clock frequency
  Shorter pipeline (12 vs. 20 stages)
  SPEC2K results: a 2GHz AMD-64 is comparable to a 2.8GHz P4
    P4 better on average by 10% & 30% (INT & FP)

Frequency and voltage scaling
  2000 – 800 MHz
  1.5 – 1.1 Volts


6

LMBench results

LMBench
  Benchmarking suite
  Low-level, micro data
  Test each "gear"

  Gear  Frequency (MHz)  Voltage
  0     2000             1.5
  1     1800             1.4
  2     1600             1.3
  3     1400             1.2
  4     1200             1.1
  6      800             0.9
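Given the gear table above, a quick sketch shows why scaling pays off: relative dynamic power (P ∝ CV²f, where the switched capacitance C is assumed identical in every gear and so cancels in the ratio) falls much faster than relative frequency. This is an illustrative calculation, not a measurement from the slides:

```python
# Gears from the LMBench table: gear -> (frequency in MHz, voltage in V).
# Gear 5 is absent here because it is absent in the source table.
GEARS = {0: (2000, 1.5), 1: (1800, 1.4), 2: (1600, 1.3),
         3: (1400, 1.2), 4: (1200, 1.1), 6: (800, 0.9)}

def relative_power(gear: int) -> float:
    """Dynamic power relative to gear 0, assuming P ~ C * V^2 * f
    with the same switched capacitance C in every gear."""
    f0, v0 = GEARS[0]
    f, v = GEARS[gear]
    return (v * v * f) / (v0 * v0 * f0)

def relative_frequency(gear: int) -> float:
    """Clock frequency relative to gear 0."""
    return GEARS[gear][0] / GEARS[0][0]

for g in sorted(GEARS):
    print(f"gear {g}: freq {relative_frequency(g):.2f}x, "
          f"power {relative_power(g):.2f}x")
```

At gear 6 the clock drops to 0.40x but modeled power drops to roughly 0.14x, which is the asymmetry the next slides exploit.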


7

Operations


8

Operating system functions


9

Communication


10

Energy-time tradeoff in HPC

Measure application performance
  Different than micro benchmarks
  Different between applications

Look at NAS
  Standard suite
  Several HPC applications
  Scientific, regular


11

Single node – EP

CPU bound:
  Big time penalty
  No (little) energy savings

[Figure: per-gear (time, energy) deltas: +11%, -2%; +45%, +8%; +150%, +52%; +25%, +2%; +66%, +15%]


12

Single node – CG

Not CPU bound:
  Little time penalty
  Large energy savings

[Figure: per-gear (time, energy) deltas: +1%, -9%; +10%, -20%]


13

Operations per miss

Metric for memory pressure
  Must be independent of time
  Uses hardware performance counters

Micro-operations
  x86 instructions become one or more micro-operations
  Better measure of CPU activity

Operations per miss (subset of NAS):

  EP    BT    LU    MG    SP    CG
  844   79.6  73.5  70.6  49.5  8.60

Suggestion: decrease gear as ops/miss decreases
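The suggestion above can be sketched as a simple gear-selection policy driven by the measured ops/miss. The threshold values here are illustrative assumptions, not values from the slides:

```python
# NAS ops-per-miss values from the table above.
OPS_PER_MISS = {"EP": 844, "BT": 79.6, "LU": 73.5,
                "MG": 70.6, "SP": 49.5, "CG": 8.60}

def suggest_gear(ops_per_miss: float) -> int:
    """Lower ops/miss means more memory pressure, so a slower
    (higher-numbered) gear costs little time but saves energy.
    The threshold values are illustrative assumptions."""
    if ops_per_miss >= 500:    # CPU bound, e.g. EP: stay in the top gear
        return 0
    if ops_per_miss >= 40:     # modest memory pressure: BT, LU, MG, SP
        return 1
    return 2                   # memory bound, e.g. CG: shift down further

for name, opm in OPS_PER_MISS.items():
    print(f"{name}: gear {suggest_gear(opm)}")
```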


14

Single node – LU

Modest memory pressure: gears offer E-T tradeoff

[Figure: per-gear (time, energy) deltas: +4%, -8%; +10%, -10%]


15

Ops per miss, LU


16

Results – LU

Per-configuration (time, energy) deltas:

  Gear 1      +5%, -8%
  Gear 2      +10%, -10%
  Shift 0/1   +1%, -6%
  Shift 1/2   +1%, -6%
  Shift 0/2   +5%, -8%
  Auto shift  +3%, -8%


17

Bottlenecks

Intra-node
  Memory
  Disk

Inter-node
  Communication
  Load (im)balance


18

Multiple nodes – EP

S2 = 2.0, S4 = 4.0, S8 = 7.9
E = 1.02

Perfect speedup: E constant as N increases


19

Multiple nodes – LU

S2 = 1.9, E2 = 1.03
S4 = 3.3, E4 = 1.15
S8 = 5.8, E8 = 1.28
Gear 2: S8 = 5.3, E8 = 1.16

Good speedup: E-T tradeoff as N increases


20

Multiple nodes – MG

S2 = 1.2, E2 = 1.41
S4 = 1.6, E4 = 1.99
S8 = 2.7, E8 = 2.29

Poor speedup: increased E as N increases


21

Normalized – MG

With a communication bottleneck, the E-T tradeoff improves as N increases


22

Jacobi iteration

Can increase N, decrease T, and decrease E


23

Future work

We are working on inter-node bottleneck


24

Safe overprovisioning


25

The problem
  Peak power limit, P
    Rack power
    Room/utility
    Heat dissipation

Static solution: number of servers is N = P/Pmax
  where Pmax is the maximum power of an individual node

Problem
  Peak power > average power (Pmax > Paverage)
  Does not use all power – N(Pmax – Paverage) goes unused
  Under-performs – performance proportional to N
  Power consumption is not predictable
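The waste in the static allocation is easy to put in numbers. A small sketch, where the per-node wattages are made-up illustrative figures rather than measurements from the slides:

```python
def static_allocation(p_budget: float, p_max: float, p_avg: float):
    """Static provisioning: N = P / Pmax nodes fit under budget P.
    Returns (N, power typically left unused: N * (Pmax - Pavg))."""
    n = int(p_budget // p_max)
    return n, n * (p_max - p_avg)

# Illustrative numbers: a 10 kW budget, nodes that peak at 250 W
# but average only 150 W.
n, unused = static_allocation(10_000, 250.0, 150.0)
print(n, unused)
```

With these numbers, 40 nodes fit, yet 4 kW of the budget sits unused in typical operation.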


26

Safe overprovisioning in a cluster
  Allocate and manage power among M > N nodes
  Pick M > N, e.g., M = P/Paverage
  M Pmax > P, so set Plimit = P/M

Goal
  Use more power, safely under the limit
  Reduce power (& peak CPU performance) of individual nodes
  Increase overall application performance

[Figure: power vs. time – left: static allocation with Pmax and Paverage around P(t); right: overprovisioned allocation with Plimit and Paverage around P(t)]


27

Safe overprovisioning in a cluster

Benefits
  Less "unused" power/energy
  More efficient power use
  More performance under the same power limitation

Let P be performance. Then more performance means:
  M P* > N P, or P*/P > N/M, or P*/P > Plimit/Pmax

[Figure: power vs. time – static allocation showing the unused energy between Pmax and P(t); overprovisioned allocation with Plimit and Paverage around P(t)]


28

When is this a win?

When P*/P > N/M, or equivalently P*/P > Plimit/Pmax

In words: the power reduction is greater than the performance reduction

Two reasons:
  1. Frequency and voltage scaling
  2. Application throughput
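The win condition is a one-line check. A sketch with invented illustrative numbers:

```python
def overprovisioning_wins(perf_ratio: float, n: int, m: int) -> bool:
    """The condition from above: P*/P > N/M, i.e. per-node
    performance under the reduced power limit falls by less than
    the node count grows (equivalently P*/P > Plimit/Pmax)."""
    return perf_ratio > n / m

# Illustrative: grow a 40-node static allocation to 50 throttled
# nodes, each still delivering 90% of full performance.
print(overprovisioning_wins(0.90, 40, 50))   # 0.90 > 40/50 = 0.80
```

The asymmetry from the frequency/voltage scaling and throughput curves is exactly what makes perf_ratio stay above N/M in practice.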

[Figure: (1) power vs. performance (freq); (2) application throughput vs. performance (freq), with regions P*/P < Paverage/Pmax and P*/P > Paverage/Pmax]


29

Feedback-directed, adaptive power control

Uses feedback to control power/energy consumption
  Given a power goal
  Monitor energy consumption
  Adjust power/performance of CPU
  Paper: [COLP '02]

Several policies
  Average power
  Maximum power
  Energy efficiency: select slowest gear (g) such that


30

Implementation Components

Two components
  Integrated into one daemon process

Daemons on each node
  Broadcast information at intervals
  Receive information and calculate Pi for the next interval
  Control power locally

Research issues
  Controlling local power
    Add guarantee, bound on instantaneous power
  Interval length
    Shorter: tighter bound on power; more responsive
    Longer: less overhead
  The function f(L0, …, LM)
    Depends on the relationship between power and performance

Pi^k: individual power limit for node i at interval (k)
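The per-node loop described above can be sketched minimally. The gear-shift rule, the 10% headroom, and the reduction of f(L0, …, LM) to an even split are all assumptions for illustration, not the paper's actual policy:

```python
GEAR_COUNT = 7  # gears 0 (fastest) .. 6 (slowest)

def control_step(measured_power: float, p_limit: float, gear: int) -> int:
    """One feedback interval: shift to a slower gear when over the
    node's limit, back to a faster one when comfortably under
    (10% headroom is an assumed tuning choice)."""
    if measured_power > p_limit and gear < GEAR_COUNT - 1:
        return gear + 1   # over budget: slow down
    if measured_power < 0.9 * p_limit and gear > 0:
        return gear - 1   # plenty of headroom: speed up
    return gear

def split_budget(p_total: float, loads: list) -> list:
    """f(L0, ..., LM) reduced to its simplest form, an even split
    of the global budget; a load-weighted split is the natural
    refinement."""
    return [p_total / len(loads)] * len(loads)
```

In a real daemon, control_step would be driven by power-meter readings each interval, and split_budget by the information broadcast between nodes.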


31

Results – fixed gear

[Figure: energy-time results at fixed gears 0–6]


32

Results – dynamic power control

[Figure: energy-time results under dynamic power control, gears 0–6]


33

Results – dynamic power control (2)

[Figure: energy-time results under dynamic power control (2), gears 0–6]


34

Summary


35

End


36

Summary

Safe overprovisioning
  Deploy M > N nodes
  More performance
  Less "unused" power
  More efficient power use

Two autonomic managers
  Local: built on prior research
  Global: new, distributed algorithm

Implementation
  Linux
  AMD

Contact: Vince Freeh, 513-7196, vin@csc.ncsu.edu


37

Autoshift


38

Phases


39

Allocate power based on energy efficiency

Allocate power to maximize throughput
  Maximize the number of tasks completed per unit energy
  Using energy-time profiles

Statically generate a table for each task
  Tuples of (gear, energy/task)

Modifications
  Nodes exchange pending tasks
  Pi determined using the table and the population of tasks

Benefit
  Maximizes task throughput

Problems
  Must avoid starvation
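The table-driven choice described above reduces to picking, per task type, the gear that minimizes energy per task. The profile numbers and task names here are invented for illustration; real tables would come from the static profiling runs:

```python
# Hypothetical static profiles: task name -> {gear: energy per task (J)}.
# Numbers are invented for illustration.
PROFILES = {
    "cg-like": {0: 120.0, 1: 100.0, 2: 95.0},  # memory bound: slow gear wins
    "ep-like": {0: 80.0, 1: 95.0, 2: 130.0},   # CPU bound: fast gear wins
}

def best_gear(task: str) -> int:
    """Choose the gear minimizing energy per task, i.e. maximizing
    tasks completed per unit energy."""
    table = PROFILES[task]
    return min(table, key=table.get)

print(best_gear("cg-like"), best_gear("ep-like"))
```

A full allocator would weight these choices by each node's pending-task population, with the starvation caveat noted above.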


40

Memory bandwidth


41

Power management

What
  Controlling power
  Achieving a desired goal

Why
  Conserve energy
  Contain instantaneous power consumption
  Reduce heat generation
  Good engineering


42

Related work: Energy conservation

Goal: conserve energy
  Performance degradation acceptable
  Usually in mobile environments (finite energy source, battery)

Primary goal: extend battery life
Secondary goal: re-allocate energy; increase the "value" of energy use
Tertiary goal: increase energy efficiency; more tasks per unit energy

Example
  Feedback-driven energy conservation
  Control average power usage: Pave = (E0 – Ef)/T

[Figure: energy store draining from E0 to Ef over time T; power vs. freq]
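The average-power policy above is a direct computation: E0 is the initial energy store, Ef the target remaining energy after horizon T. A sketch with invented illustrative numbers:

```python
def average_power_budget(e0: float, ef: float, t: float) -> float:
    """Pave = (E0 - Ef) / T: the average draw that takes the energy
    store from E0 down to Ef over horizon T."""
    if t <= 0:
        raise ValueError("horizon T must be positive")
    return (e0 - ef) / t

# Illustrative: a 180 kJ battery, a 36 kJ reserve, a 4-hour horizon.
print(average_power_budget(180_000.0, 36_000.0, 4 * 3600))
```

The feedback loop then picks gears so the measured average draw tracks this Pave.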


43

Related work: Realtime DVS

Goal: reduce energy consumption
  With no performance degradation

Mechanism: eliminate slack time in the system

Savings
  Eidle, with F scaling
  Additionally Etask – Etask', with V scaling

[Figure: power vs. time – before: the task consumes Etask at Pmax, finishes before the deadline, and Eidle is wasted; after: the task is stretched to the deadline, consuming Etask']


44

Related work: Fixed installations

Goal: reduce cost (in heat generation or $)
  The goal is not to conserve a battery

Mechanisms
  Scaling
    Fine-grain – DVS
    Coarse-grain – power down
  Load balancing


45

Single node – MG


46

Single node – EP


47

Single node – LU


48

Power, energy, heat – oh, my

Relationship
  E = P · T
  H ∝ E
  Thus: control power

Goal
  Conserve (reduce) energy consumption
  Reduce heat generation
  Regulate instantaneous power consumption

Situations (benefits)
  Mobile/embedded computing (finite energy store)
  Desktops (save $)
  Servers, etc. (increase performance)


49

Power usage

CPU power
  Dominated by dynamic power

System power dominated by
  CPU
  Disk
  Memory

CPU notes
  Scalable
  Drives the rest of the system
  A measure of performance

CMOS dynamic power equation: P = ½CV²f

[Figure: power vs. performance (freq)]


50

Power management in HPC

Goals
  Reduce heat generation (and $)
  Increase performance

Mechanisms
  Scaling
  Feedback
  Load balancing


51

Single node – MG

Modest memory pressure: gears offer E-T tradeoff

[Figure: per-gear (time, energy) deltas: +6%, -7%; +12%, -8%]


52

Power management vs. energy conservation
  Power management is the mechanism
  Energy conservation is the policy

Two elements
  Energy efficiency
    i.e., decrease the energy consumed per task
  (Instantaneous) power consumption
    i.e., limit the maximum Watts used

Power-performance tradeoff
  Less power & less performance
  Ultimately energy-time

Power management: AMD system, 6 gears, 2GHz – 800MHz


53

Autonomic managers

Implementation uses two autonomic managers
  Local – power control
  Global – power allocation

Local
  Uses a prior research project (new implementation)
  Requires a new policy
  Daemon process
    Reads the power meter
    Adjusts the processor performance gear (freq)

Global
  At regular intervals
    Collects appropriate information from all nodes
    Allocates the power budget for the next quantum
  Optimizes for one of several objectives


54

Example: Load imbalance

Uniform allocation of power
  Pi = Plimit = P/M, for node i
  Not ideal if nodes are unevenly loaded
    Tasks execute more slowly on busy nodes
    Lightly loaded nodes may not use all their power

Allocate power based on load*
  At regular intervals, nodes exchange load information
  Each computes an individual power limit for the next interval (k)

*Note: Load is one of several possible objective functions.

Pi^k: individual power limit for node i at interval k

Ensure:
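A sketch of the load-based alternative to the uniform split. The proportional rule and the constraint that the limits sum to the global budget P are assumptions standing in for the "Ensure:" condition, which is not captured in the transcript:

```python
def load_based_limits(p_total: float, loads: list) -> list:
    """Allocate each node's limit Pi in proportion to its load Li,
    so busy nodes get more power; the limits sum to the global
    budget P (an assumed constraint, not taken from the slides)."""
    total = sum(loads)
    if total == 0:
        # No load anywhere: fall back to the uniform split P/M.
        return [p_total / len(loads)] * len(loads)
    return [p_total * l / total for l in loads]

limits = load_based_limits(1000.0, [4.0, 1.0, 1.0, 2.0])
print(limits)
```

Here the busiest node receives half the budget while the lightly loaded nodes give up the power they would not have used.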