SLURM Migration Experience at
Lawrence Berkeley National Laboratory
Presented by
Jacqueline Scoggins
System Analyst, HPCS Group
Slurm Conference September 22-23, 2014
Lugano, Switzerland
Who are we?
A National Laboratory Bringing Science Solutions to the World
HPCS Group
High Performance Computing Services
# We began providing HPCS in 2003, supporting 10 departmental clusters.
# Today we manage, throughout the LBNL campus and a subset of the UC Berkeley campus:
  over 30 clusters, ranging from 4-node to 240-node clusters
  over 2000 user accounts
  over 140 projects
OUR ENVIRONMENT
# We use the term "SuperCluster":
  Single master node - provisioning node and scheduler system
  Multiple interactive login nodes
  Multiple clusters
# Institutional and departmental purchased nodes
# Global home filesystems
# Flattened network topology, so everything communicates with the management/interactive nodes
[Figure: SuperCluster layout - Global Filesystem, Interactive User Nodes (OTP access), Management & Scheduler Node, Compute Nodes]
OUR SETUP
# Nodes are stateless and provisioned using Warewulf (interactive, compute and storage nodes)
# Jobs could not span across clusters
# All user accounts exist on the interactive nodes
# All user accounts exist on the scheduler node, but with nologin access.
# User accounts are only added to the compute nodes they are allowed to run on.
Moab Configuration (cont'd.)
# ACLs on users/groups and the resources they could use, set in the Moab CLASSCFG and related CFG entries; they were also configured within TORQUE for each queue
# Limits/Policies
Moab Configuration (cont'd.)
# Account Allocations - Gold banking for charging CPU cycles used, and setting up rates for node types/features
# Backfill
# Standing Reservations
# Condo Computing
What made us migrate?
# Too many unpredictable scheduler issues
# Existing support contract was up for renewal
# Slurm had the features and capabilities we needed
# Slurm architectural design was less complex
What did we need to migrate?
# Single scheduler capable of managing department owned and institutional owned nodes
# Scheduler needed to be able to separate users' access from nodes they do not have accounts on
# Scheduler needed to be able to avoid scheduling conflicts with users from other groups
What did we need to migrate?
# Scheduler needed to be able to have limits on jobs/partitions/nodes independently
# Needed an allocation management scheme to be able to continue charging users
# Needed to be able to use a Node Health Check method for managing unhealthy nodes (see the sketch below)
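Slurm has hooks that such a health-check method can plug into. The fragment below is only a hypothetical sketch: the script path and interval are examples, not our production values.

  # Hypothetical slurm.conf fragment wiring a node-health-check script into slurmd
  HealthCheckProgram=/usr/sbin/nhc    # assumed install path of the health-check script
  HealthCheckInterval=300             # run the check on every node every 5 minutes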
What were some of our Challenges?
# Migrating hundreds of users and more than ten clusters within a 4-month period to a new system
# Implementation design:
  How will we configure the queues?
  How will we get fairshare and backfill to work?
  How will we implement our Accounting/Charging model? (Condo vs non-condo usage on the institutional cluster)
  Can we control user access to the nodes?
How did we configure Slurm?
# Jobs run on nodes exclusively or shared
# Multiple partitions to separate the clusters from each other
# QOSs to set up the limits for the jobs/partitions, and which ones a user can use within a partition
# Backfill
# Fairshare
# Priority job scheduling
# Network topology configured to group the nodes and switches to best place the jobs
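A rough slurm.conf sketch of how those pieces can fit together; the partition names, node ranges, groups and priority weights below are invented for illustration and are not our actual settings.

  # Backfill scheduling with multifactor (fairshare/age) job priority
  SchedulerType=sched/backfill
  PriorityType=priority/multifactor
  PriorityWeightFairshare=100000
  PriorityWeightAge=1000
  # Enforce the associations, limits and QOSs stored in the accounting database
  AccountingStorageEnforce=associations,limits,qos
  # Describe the switch layout so jobs are placed on nearby nodes
  TopologyPlugin=topology/tree
  # One partition per cluster; AllowGroups keeps other clusters' users off,
  # Shared controls whether jobs get whole nodes or may share them
  PartitionName=lr2  Nodes=n[0000-0127]  AllowGroups=lr2users  Shared=EXCLUSIVE State=UP
  PartitionName=mako Nodes=mako[000-063] AllowGroups=makousers Shared=YES       State=UP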
How did we migrate?
We ran both schedulers simultaneously during the 4-month process
We migrated either a set of clusters or a single cluster over to Slurm at a time
We created scripts for adding accounts and users (a sketch follows below)
We had to educate our users
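Those account/user scripts were essentially wrappers around sacctmgr. A minimal sketch of the idea, assuming a colon-separated username:project input file (the file name, format and helper are invented; the real scripts did more):

  #!/usr/bin/env python
  # Sketch: bulk-load projects and users into the Slurm accounting database.
  # Assumed input format, one record per line:  username:project
  import csv
  import subprocess

  def sacctmgr(*args):
      # "-i" (immediate) answers the commit prompt so the script runs unattended
      subprocess.call(["sacctmgr", "-i"] + list(args))

  with open("users.txt") as fh:
      for row in csv.reader(fh, delimiter=":"):
          if len(row) < 2:
              continue                      # skip blank/malformed lines
          username, project = row[0], row[1]
          sacctmgr("add", "account", project)                     # harmless if it already exists
          sacctmgr("add", "user", username, "Account=" + project)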
Problems Encountered
Accounting: we had to create a separate mechanism for managing charging. We used our existing user data text file, adding a few fields to identify a user with all of the projects they were associated with. We need the key values (username, project id and account name) for billing the users, and Slurm does not have that association in its DB.
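A sketch of the kind of join that mechanism performs; the file name, field layout and date range below are assumptions, not our actual billing code.

  #!/usr/bin/env python
  # Sketch: join a site user-data file with Slurm usage for billing.
  # Assumed file layout, one record per line:  username:project_id:billing_account
  import subprocess
  from collections import defaultdict

  billing = {}
  with open("/etc/hpcs/userdata.txt") as fh:
      for line in fh:
          user, project, account = line.strip().split(":")[:3]
          billing[user] = (project, account)

  # Per-job CPU seconds for the billing period (parsable output, no header)
  out = subprocess.check_output(
      ["sacct", "-a", "-X", "-n", "-P",
       "--starttime=2014-08-01", "--endtime=2014-09-01",
       "--format=User,CPUTimeRAW"], universal_newlines=True)

  usage = defaultdict(int)
  for line in out.splitlines():
      user, cpu_seconds = line.split("|")[:2]
      if user and cpu_seconds.isdigit():
          usage[user] += int(cpu_seconds)

  for user, seconds in sorted(usage.items()):
      project, account = billing.get(user, ("unknown", "unknown"))
      print("%s %s %s %.1f core-hours" % (user, project, account, seconds / 3600.0))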
Problems Encountered
Backfill was not properly picking up all of the queued jobs that could fit into the schedule. We had to change some of the parameters:
TreeWidth = 4096
SchedulerParameters = bf_continue
Default/min job limits: the ability to set default or minimum limits is not available. It would be good to have at least a default setting for wallclock limits, in case a user forgets to request one.
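For reference, a hypothetical slurm.conf fragment with parameters of the kind mentioned above (the bf_max_job_test value and the partition line are examples, not our settings); a partition-level DefaultTime is one way to supply a fallback wallclock limit when a user forgets to request one, though it does not cover every case.

  # bf_continue lets the backfill pass resume where it left off after releasing locks;
  # bf_max_job_test raises how many queued jobs each backfill cycle considers
  SchedulerParameters=bf_continue,bf_max_job_test=1000
  # Fan-out of the slurmd communication tree
  TreeWidth=4096
  # Fallback and ceiling for job wallclock limits on this partition
  PartitionName=lr3 Nodes=n[0000-0511] DefaultTime=01:00:00 MaxTime=72:00:00 State=UP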
Condo vs Non-Condo Computing
Condo Computing - a PI purchases nodes and we incorporate them into the compute node environment. The users of the condo may run for free, but only on their contributed nodes. If they run outside of their contributed condo resources, they are subject to the same recharge rates as all other users.
Non-condo Computing - institutional compute nodes, plus condo compute nodes not in use by the condo users. Users are charged a recharge rate of $0.01 per Service Unit (SU).
NOTE: We reduce the charge rate by 25% for each generation of nodes:
lr1 - 0.50 SU per core-CPU hour ($0.005 per core/hr)
mako and lr2 - 0.75 SU per core-CPU hour ($0.0075 per core/hr)
lr3 - 1 SU per core-CPU hour ($0.01 per core/hr)
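As a worked example of these rates (the job size and runtime are invented): a 16-core job that runs for 10 hours on lr2 uses 16 × 10 = 160 core-CPU hours; at 0.75 SU per core-CPU hour that is 160 × 0.75 = 120 SU, and at $0.01 per SU the recharge comes to $1.20.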
Plus side of SLURM
# Capabilities for separating users' resource pools via accounting associations
# Capabilities for setting up fairshare from a parent level: the children inherit equal shares and individually get lower priority if they use the resources over a window of time
# So far it has been simple and easy to use
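A hypothetical sacctmgr sequence illustrating that parent/child fairshare setup (the account and user names are invented):

  # Give the parent account its share of the machine
  sacctmgr -i add account chem_dept Fairshare=100
  # Users under it inherit equal shares from the parent
  sacctmgr -i add user alice Account=chem_dept Fairshare=parent
  sacctmgr -i add user bob   Account=chem_dept Fairshare=parent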