Transcript of U.S. Army Research, Development and Engineering Command presentation: Early Experiences with Energy-Aware (EAS) Scheduling

Page 1: U.S. Army Research, Development and Engineering Command Presentation Date Early Experiences with Energy-Aware (EAS) Scheduling.

U.S. Army Research, Development and Engineering Command

Presentation Date

Early Experiences with Energy-Aware (EAS) Scheduling

Page 2:

Agenda

• Project Motivation

• Overview

• Efforts To Date

• Results

• Ongoing Efforts

• Summary

Page 3:

Project Motivation

• With the average HPCMP HPC System Utilization in the 80% range, opportunities exist to power off idle compute nodes, reducing energy costs for the idle nodes and associated cooling.

• Scheduling of reservations, system maintenance events, and scheduling of large node count jobs all create opportunities to power off idle nodes.

• Idle nodes consume approximately 50% of the power of an active node.

Page 4:

Overview

• Project initiated via HPCMP Energy Efficiency Initiatives, 8/11.

• Motivation: Explore the feasibility of powering off idle resources to save energy.

– There are commercial implementations on cyclic workloads that have proven successful, but is it feasible for large non-cyclic HPC workloads?

• Theory: HPCMP systems typically run at ~80% utilization. If we can power off 10% of a system’s nodes, there would be significant energy and monetary savings to the program. Annual HPCMP kWh savings from powering off x% of systems for y hours/day:

Architecture      10% 12h/D   15% 12h/D   10% 24h/D   15% 24h/D
XE6                 224,012     336,018     448,024     672,036
Altix               263,725     395,588     527,450     791,175
Subtotal            487,737     731,606     975,474   1,463,211
Site (1.8 PUE)      390,190     585,284     780,379   1,170,569
Total               877,927   1,316,890   1,755,853   2,633,780
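The theory above is a back-of-the-envelope calculation that can be sketched directly. A minimal sketch, assuming an illustrative node count and per-node idle power draw (the function name and numbers are hypothetical, not the HPCMP's actual model); only the 1.8 PUE factor is taken from the table:

```python
# Back-of-the-envelope estimate behind the savings table: kWh/year saved
# when a fraction of a system's nodes are powered off some hours each day.
# All inputs here are illustrative assumptions, not HPCMP measurements.

def annual_kwh_saved(total_nodes, idle_node_kw, frac_off, hours_per_day, pue=1.8):
    """Annual site-level kWh saved; the PUE factor folds in cooling overhead."""
    node_kwh = total_nodes * frac_off * idle_node_kw * hours_per_day * 365
    return node_kwh * pue

# Hypothetical system: 1,000 nodes idling at 0.3 kW each,
# 10% of them powered off for 12 hours/day.
print(round(annual_kwh_saved(1000, 0.3, 0.10, 12)))  # 236520
```

As in the table, the site row scales the node subtotal by the PUE, so cooling savings track compute savings.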

Page 5:

Overview (continued)

• Issues considered

– No reduction in service delivery or increased expansion factors

– Software/system viability and architecture considerations: e.g., the CRAY XT5 architecture will not recover from a powered-off node

– No significant performance degradation for jobs

– Idle nodes use 50%–60% of the power of active nodes

– System reliability effects

– Cooling requirement reductions

Page 6:

Efforts To Date

• Altair’s base code for the commercial application was significantly adapted to the scale of HPC systems, the HPC job mix, and the non-cyclic job model.

• Standardized system interfaces created for system control.

• Code updated to be backfill-aware.

• Significant development/testing on ARL Test & Development Systems (TDS)
– passive mode
– live mode

• Network Effects test on Tana (CRAY XE6, 256 cores)
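The backfill-aware behavior noted above can be illustrated with a toy decision check: an idle node should only be shut down if the backfill window shows no queued job that could use it sooner than the node could complete a power cycle. This is an assumed simplification, not Altair's actual code; the threshold and function name are hypothetical:

```python
# Toy sketch (assumed logic) of a backfill-aware power-off check.
# A node is a shutdown candidate only if it is idle AND no backfill job
# is expected to need it before a full power-off/power-on cycle finishes.

POWER_CYCLE_COST_S = 600  # assumed time to shut down and reboot a node

def can_power_off(node_is_idle, earliest_backfill_start_s):
    """True if shutting the node off cannot delay any backfill job."""
    return node_is_idle and earliest_backfill_start_s > POWER_CYCLE_COST_S

print(can_power_off(True, 3600))  # idle, next job an hour away -> True
print(can_power_off(True, 120))   # backfill job starts in 2 min -> False
```

Without a check of this kind, aggressive shutdowns would increase expansion factors for short backfilled jobs, which the project's issues list explicitly rules out.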

Page 7:

Efforts To Date (continued)

• Production on Army Research Laboratory DoD Supercomputing Resource Center (ARL DSRC) systems

– Harold (SGI Altix ICE Quad Core Xeon Cluster, 10,752 cores)
– TOW (SGI Altix ICE Quad Core Xeon Cluster, 6,656 cores)
– ICECUBE (SGI Altix ICE Quad Core Xeon Cluster TDS, 96 cores)
– MJM (Linux Networx Evolocity II Xeon Cluster, 3,248 cores)

• Development applicable to Mana: US and other IB clusters
– Abutil (Appro Xtreme-X Series Cluster, 1,824 cores)
– Cabutil (Appro Xtreme-X Series Cluster, 1,008 cores)

• Production on Maui High Performance Computing Center (MHPCC) system

– Mana (Dell Quad Core Xeon Cluster, 9,216 cores)
• Transitioned slowly to “live mode” starting 25 Apr 12

– Muutil (Appro Quad Core Xeon Cluster)
• EAS/PBSPro 11.2 installed. Ran in passive mode for 2 months. Waiting for approval.

Page 8:

Harold (SGI Altix ICE 10,752 cores) Results

[Chart: HAROLD Number of Nodes Powered-Off, daily, 4/1 through 8/29; series: Harold highest number of powered-off nodes and Harold average number of powered-off nodes; y-axis: Number of Nodes, 0 to 1000]

Page 9:

Harold Results (continued)

• Moved reservation nodes from racks 17-21 to racks 1-5.
– Improved backfill jobs and freed up nodes for longer running jobs

• Scheduled PM on 5/27 shows nodes powered down as no pending jobs could complete prior to PM reservation.

• Emergency PM on 6/12 shows nodes powered down due to system inactivity.
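The PM behavior above follows from a simple scheduling condition that can be sketched: ahead of a maintenance reservation, a node stays busy only if some pending job's walltime fits in the remaining window; otherwise the node drains idle and EAS can power it down. A toy illustration with a hypothetical function (not the scheduler's actual code):

```python
# Sketch (assumed logic) of why nodes power down ahead of a scheduled PM:
# a node keeps working only if a pending job can finish before the PM
# reservation starts; otherwise it drains idle and becomes an EAS target.

def node_stays_on(pending_walltimes_s, seconds_until_pm):
    """True if any pending job could complete before the PM reservation."""
    return any(w <= seconds_until_pm for w in pending_walltimes_s)

# PM in 2 hours; shortest pending job needs 4 hours -> node drains.
print(node_stays_on([14400, 28800], 7200))  # False
```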

Page 10:

Harold Results (continued)

• Harold EAS Savings:
– Apr 11,498 hrs (1,690 kWh)
– May 31,604 hrs (4,645 kWh)
– Jun 30,102 hrs (4,424 kWh)
– Jul 1,273 hrs (187 kWh)
– Aug 18,385 hrs (3,280 kWh)

• Harold Utilization:
– 95.5% for Apr (7,396,061 hrs)
– 89.9% for May (7,194,403 hrs)

• Limited opportunity to save energy on Harold at its current level of utilization.

• Estimated Yearly Savings is $6,000
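The monthly figures above follow a straightforward conversion from node-hours powered off to kWh and then to dollars. A sketch of that arithmetic; the per-node power value is implied by the reported Harold numbers (about 1,690 kWh over 11,498 node-hours), while the electricity rate is an assumption for illustration only:

```python
# Converting node-hours powered off into energy and cost savings.
# PER_NODE_KW is implied by the slide's Apr figures; RATE_PER_KWH is an
# assumed rate, not an HPCMP number.

PER_NODE_KW = 0.147    # kW saved per powered-off node (implied, approximate)
RATE_PER_KWH = 0.10    # $/kWh, assumed for illustration

def kwh_saved(node_hours_off):
    return node_hours_off * PER_NODE_KW

def dollars_saved(node_hours_off):
    return kwh_saved(node_hours_off) * RATE_PER_KWH

print(round(kwh_saved(11498)))  # April on Harold: 1690 kWh
```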

Page 11:

Harold Lifecycle Utilization

[Chart: monthly utilization (%), Sep-09 through May-12, 0 to 100%; annotations: Vendor ELT, CAP, Start of Production (6 months to exceed 80%), OS Upgrade]

Harold took 7 months from start of production to reach 80% utilization.

$200K of power cost expended over life cycle for idle nodes.

Page 12:

TOW (SGI Altix ICE 6,656 cores) Results

[Chart: TOW Number of Nodes Powered-Off, daily, 4/1/2012 through 8/29/2012; series: TOW highest number of powered-off nodes and TOW average number of powered-off nodes; y-axis: Number of Nodes, 0 to 250]

Page 13:

TOW Results (continued)

• Lustre issues with too many nodes powering-on

• EAS disabled while node startup issues addressed. Lustre upgrade and tuning/EAS resumed on a limited set of nodes.

• Low EAS usage since 5/11 due to increased workload and a 130-node dedicated project

• TOW EAS Savings:
– Apr 15,555 hrs (2,286 kWh)
– May 1,333 hrs (195 kWh)
– Jun 6,335 hrs (931 kWh)
– Jul 4,023 hrs (591 kWh)
– Aug 7,337 hrs (1,078 kWh)

• TOW Utilization:
– 59.9% for Apr (2,872,756 hrs)
– 73.5% for May (3,617,621 hrs)

Page 14:

Abutil (Appro Xtreme Cluster 1,408 cores) Results

[Chart: ABUTIL Number of Nodes Powered-Off, daily, 4/1/2012 through 8/31/2012; series: ABUTIL highest number of powered-off nodes and ABUTIL average number of powered-off nodes; y-axis: Number of Nodes, 0 to 70]

Page 15:

Abutil Results (continued)

• EAS Configured on Abutil in February

• EAS was disabled for short period for user code testing

• Abutil EAS Savings:
– Apr 29,267 hrs (11,024 kWh)
– May 38,044 hrs (13,934 kWh)
– Jun 37,007 hrs (13,544 kWh)
– Jul 21,602 hrs (7,819 kWh)
– Aug 21,195 hrs (7,826 kWh)

• Abutil Utilization:
– 3.9% for Apr (49,924 hrs)
– 3.6% for May (47,803 hrs)

– In the July timeframe, a keep-alive feature was implemented, increasing the number of powered-on nodes.

• Daily Average of 53 of 88 Nodes Powered-off

• Estimated Yearly Savings is approx $22k

Page 16:

ARL Cabutil Results

[Chart: CABUTIL Number of Nodes Powered-Off, daily, 4/1/2012 through 8/29/2012; series: CABUTIL highest number of powered-off nodes and CABUTIL average number of powered-off nodes; y-axis: Number of Nodes, 0 to 50]

Page 17:

ARL Cabutil (continued)

• EAS Configured on Cabutil in March

• EAS Disabled for a Short Period to Test User Code

• Cabutil EAS Savings:
– Apr 18,900 hrs (6,128 kWh)
– May 18,701 hrs (6,063 kWh)
– Jun 20,840 hrs (7,383 kWh)
– Jul 19,530 hrs (6,798 kWh)
– Aug 10,306 hrs (3,390 kWh)

• Cabutil Utilization:
– 0.8% for Apr (5,541 hrs)
– 7.9% for May (58,163 hrs)

• Estimated Yearly Savings is approx $11k

• Daily Average of 37 of 44 nodes powered-off

Page 18:

MHPCC Mana Results

[Chart: MANA Number of Nodes Powered-Off, daily, 4/1/2012 through 6/12/2012; series: MANA highest number of powered-off nodes and MANA average number of powered-off nodes; y-axis: Number of Nodes, 0 to 1200]

Page 19:

MHPCC Mana Results (continued)

[Chart: MANA Number of Nodes Powered-Off, daily, 4/1/2012 through 8/29/2012; series: MANA highest number of powered-off nodes (legend mislabeled “Series1”) and MANA average number of powered-off nodes; y-axis: Number of Nodes, 0 to 1200]

Page 20:

MHPCC Mana Results (continued)

MANA Hours Powered Off & Cost Savings

Month     Hours powered off   Estimated dollar savings at site
Apr-12              495,022                           $28,216
May-12              247,144                           $14,086
Jun-12              203,243                           $11,584
Jul-12              261,925                           $19,325
Aug-12              212,803                           $12,129

Page 21:

MHPCC Mana Results (continued)

• EAS Configured on Mana in March

• Mana Monthly Savings:
– Apr 495,022 hrs (74,253 kWh) - virtual savings
– May 247,144 hrs (37,071 kWh) - about 50% of the system was under EAS control
– Jun 203,243 hrs (30,486 kWh) - less than 50% was under EAS control
– Jul 261,925 hrs (39,273 kWh)
– Aug 212,803 hrs (31,920 kWh)

• Mana Estimated Yearly Savings approx $160k

Page 22:

Ongoing Efforts

• Upgrade to PBSPro 11.3
– Currently testing at ARL

• Utility Server
– Implement EAS on AFRL, ERDC and NAVY DSRC Utility Servers

• Implement EAS on TI-12 systems across the HPCMP

Page 23:

Summary

• EAS found to be feasible on many architectures.

• EAS needs to be more tightly integrated with the scheduler.

• EAS appears to be compatible with the IBM iDataPlex TI-12 systems.

• HPC vendor resistance has been encountered, but EAS node shutdown/startup provides an opportunity to perform proactive hardware checks, improving long-term reliability.

Page 24:

Summary (continued)

• Measured power savings are encouraging.

• Thanks to all!

– ARL (Project Sponsorship & Executive Agent)
– Lockheed Martin (LM) (Architectural Implementation, Algorithm Development, Performance/Feasibility Evaluation)
– Altair (Algorithm Development, Scheduler Interface)
– Instrumental (Architectural Implementation, HPCMP Coordination)
– Air Force Research Laboratory Maui High Performance Computing Center (MHPCC)
– ORS (XE6 evaluation)
– SGI (System Monitoring, Architectural Implementation)
– APPRO (Support for EAS on HPCMP Utility Servers)

Page 25:

Questions?