New Seaborg Queue Configuration Results May 29, 2003 David Turner NERSC User Services Group


New Seaborg Queue Configuration

Results

May 29, 2003

David Turner

NERSC User Services Group

[email protected]

510-486-4027


Introduction

• Review of LoadLeveler Class Structure
  - NERSC-3 classes
  - Proposed NERSC-3 Extended classes
  - Current NERSC-3 Extended classes
  - Objectives of current class structure

• Effects of Current Structure
  - Connect time: wallclock * nodes * 16
  - Wait time: start time - submit time
  - Connect time / Wait time

• Conclusions
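The connect-time and wait-time definitions on this slide can be sketched per job as follows. This is only an illustration: the `Job` record and its field names are hypothetical (they are not LoadLeveler fields), and measuring wallclock time in hours is an assumption. The factor 16 is the processors-per-node multiplier from the slide's formula.

```python
from dataclasses import dataclass
from datetime import datetime

PROCS_PER_NODE = 16  # factor used in the slide's connect-time formula


@dataclass
class Job:
    """Hypothetical job record; field names are illustrative only."""
    submit_time: datetime
    start_time: datetime
    wallclock_hours: float  # assumed unit: hours
    nodes: int


def connect_time(job: Job) -> float:
    """Connect time = wallclock * nodes * 16 (processor-hours here)."""
    return job.wallclock_hours * job.nodes * PROCS_PER_NODE


def wait_time_hours(job: Job) -> float:
    """Wait time = start time - submit time, expressed in hours."""
    return (job.start_time - job.submit_time).total_seconds() / 3600.0


def connect_per_wait(job: Job) -> float:
    """Connect time divided by wait time; rejects a non-positive wait."""
    wait = wait_time_hours(job)
    if wait <= 0:
        raise ValueError("wait time must be positive to form the ratio")
    return connect_time(job) / wait


# A made-up job: 32 nodes for 2 hours, started 4 hours after submission.
example = Job(
    submit_time=datetime(2003, 5, 1, 8, 0),
    start_time=datetime(2003, 5, 1, 12, 0),
    wallclock_hours=2.0,
    nodes=32,
)
```

Whether the charts later in the talk take the ratio per job or over aggregated totals is not stated in the slides; the per-job form above is one possible reading.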


NERSC-3 Class Structure

Class          Nodes (Procs)   Time Limit
interactive      8 (128)       30 min
debug           16 (256)       30 min
premium        128 (2,048)     8 hrs
regular        128 (2,048)     8 hrs
low            128 (2,048)     8 hrs
regular_long    32 (512)       24 hrs


Proposed Class Structure

Class          Nodes (Procs)              Time Limit   Priority
interactive    8 (128)                    30 min       1
debug          24 (384)                   30 min       1
premium        256 (4,096)                12 hrs       2
large          32 – 256 (512 – 4,096)     48 hrs       3
regular        256 (4,096)                12 hrs       4
regular_long   32 (512)                   24 hrs       4
low            256 (4,096)                12 hrs       5


Other Proposed Changes

• Various limit adjustments
  - Increase user run limit from 4 to 6
  - Eliminate class limit of 7 in regular_long
  - Retain 1 running, 1 queued limit in regular_long

• Eliminate aging
  - Incompatible with class priorities

• Schedule lowest load average and smallest memory nodes first

• Tune scheduling parameters to maintain responsiveness


Current Class Structure

Submit Class   LLClass       Max Nodes   Max Procs    Max Hours   Relative Priority
interactive    interactive   1–8         1–128        0.5         1
debug          debug         1–24        1–384        0.5         2
premium        pre_1         1–31        1–496        12          7
               pre_32        32–127      497–2032     48          5
               pre_128       128–380     2033–6080    48          3
regular        reg_1         1–31        1–496        12          8
               reg_1l        1–31        1–496        24          8
               reg_32        32–127      497–2032     48          6
               reg_128       128–380     2033–6080    48          4
low            low           1–128       1–2048       12          9
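One way to read the Current Class Structure table is as a lookup from a submit class plus a node count and requested hours to a LoadLeveler class. The sketch below transcribes the table rows and picks the matching class with the shortest time limit. The `route` function and its tie-breaking rule are assumptions made for illustration; the slides do not describe how Seaborg actually performs this mapping.

```python
from typing import NamedTuple, Optional


class LLClass(NamedTuple):
    submit_class: str
    name: str
    min_nodes: int
    max_nodes: int
    max_hours: float
    relative_priority: int


# Rows transcribed from the Current Class Structure table.
CLASSES = [
    LLClass("interactive", "interactive", 1, 8, 0.5, 1),
    LLClass("debug", "debug", 1, 24, 0.5, 2),
    LLClass("premium", "pre_1", 1, 31, 12, 7),
    LLClass("premium", "pre_32", 32, 127, 48, 5),
    LLClass("premium", "pre_128", 128, 380, 48, 3),
    LLClass("regular", "reg_1", 1, 31, 12, 8),
    LLClass("regular", "reg_1l", 1, 31, 24, 8),
    LLClass("regular", "reg_32", 32, 127, 48, 6),
    LLClass("regular", "reg_128", 128, 380, 48, 4),
    LLClass("low", "low", 1, 128, 12, 9),
]


def route(submit_class: str, nodes: int, hours: float) -> Optional[LLClass]:
    """Hypothetical routing: return the table row that fits the request.

    When several rows fit (reg_1 and reg_1l share a node range), the one
    with the shortest time limit is chosen. Returns None if nothing fits.
    """
    candidates = [
        c for c in CLASSES
        if c.submit_class == submit_class
        and c.min_nodes <= nodes <= c.max_nodes
        and hours <= c.max_hours
    ]
    return min(candidates, key=lambda c: c.max_hours, default=None)
```

For example, under these assumptions a 16-node, 20-hour `regular` request maps to `reg_1l`, while a 200-node `low` request matches no row because `low` stops at 128 nodes.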


General Batch Policies

• Each user may have:
  - 6 jobs running
  - 10 jobs considered for scheduling (idle state)
  - 30 jobs submitted

• The class run limit for reg_1l is 15 jobs

• Jobs requesting 8 hours or less will complete before scheduled outages

• Jobs placed on “user hold” (status HU) will be removed after one week


Objectives of Class Structure

• Allow 4096-way jobs
  - Current MPI maximum

• Favor “large” jobs

• Provide longer time limit for “regular” jobs

• Provide more resources to “long” jobs
  - Allow greater access

• Provide more resources to interactive and debug jobs
  - As needed

All while maintaining system responsiveness


N3 vs. N3E

• N3
  - October 1, 2002 – March 2, 2003
  - 153 days

• N3E
  - March 3, 2003 – May 20, 2003
  - 79 days
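The day counts for the two periods can be checked with a short calculation. The sketch below assumes both endpoints are counted (an inclusive range), which is the reading that reproduces the 153 and 79 days on the slide; the helper name `inclusive_days` is illustrative.

```python
from datetime import date


def inclusive_days(first: date, last: date) -> int:
    """Number of calendar days from first to last, counting both ends."""
    return (last - first).days + 1


# N3: October 1, 2002 through March 2, 2003
n3_days = inclusive_days(date(2002, 10, 1), date(2003, 3, 2))

# N3E: March 3, 2003 through May 20, 2003
n3e_days = inclusive_days(date(2003, 3, 3), date(2003, 5, 20))

# Period lengths in weeks, the unit a per-week average would be taken over.
n3_weeks = n3_days / 7
n3e_weeks = n3e_days / 7
```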


Jobs Per Week

Nodes       N3        N3E
1 - 15      9208.6    8534.7
16 - 31      143.9     253.3
32 - 63       13.7      88.9
64 - 127      12.4      48.0
128+           3.6      17.8
Total       9382.2    8942.7
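To make the shift toward larger jobs easier to see, the table values can be turned into N3E-to-N3 ratios per size bin. The sketch below uses only the numbers in the table; the variable and function names are illustrative, and the ratio is just one possible summary, not a figure from the talk.

```python
# Jobs per week by node-count bin, transcribed from the table: (N3, N3E).
JOBS_PER_WEEK = {
    "1 - 15": (9208.6, 8534.7),
    "16 - 31": (143.9, 253.3),
    "32 - 63": (13.7, 88.9),
    "64 - 127": (12.4, 48.0),
    "128+": (3.6, 17.8),
}


def column_totals(table):
    """Sum the N3 and N3E columns over all size bins."""
    n3_total = sum(n3 for n3, _ in table.values())
    n3e_total = sum(n3e for _, n3e in table.values())
    return n3_total, n3e_total


def growth_ratios(table):
    """N3E jobs per week divided by N3 jobs per week, per size bin."""
    return {bin_name: n3e / n3 for bin_name, (n3, n3e) in table.items()}


totals = column_totals(JOBS_PER_WEEK)
ratios = growth_ratios(JOBS_PER_WEEK)

for bin_name, ratio in ratios.items():
    print(f"{bin_name:>8} nodes: N3E / N3 = {ratio:.2f}")
```

The column sums agree with the Total row of the table, and only the smallest bin (1 - 15 nodes) has a ratio below 1.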


Connect Time vs. Class

[Figure: bar chart of connect time (%, axis 0 to 90) by charge class (Low, Regular, Premium), comparing N3 and N3E.]


Connect Time vs. Size

[Figure: bar chart of connect time (%, axis 0 to 80) by number of nodes (1-15, 16-31, 32-63, 64-127, 128+), comparing N3 and N3E.]


Wait Time vs. Size

[Figure: bar chart of wait time (h:mm:ss, axis 0:00:00 to 24:00:00) by number of nodes (1-15, 16-31, 32-63, 64-127, 128+), comparing N3 and N3E.]


Connect Time / Wait Time

[Figure: bar chart of connect time / wait time (axis 0.00 to 900.00) by number of nodes (1-15, 16-31, 32-63, 64-127, 128+), comparing N3 and N3E.]


Conclusions

• Users running larger jobs

• Users running longer jobs

• Interactive and debug throughput maintained


Resources I

http://hpcf.nersc.gov/running_jobs/ibm/llsum/summary.php


Resources II

http://hpcf.nersc.gov/running_jobs/ibm/llsum/


End of Talk

This slide intentionally left blank.