Post on 27-Jul-2020
Requesting Resources on an
HPC Facility
Michael Griffiths and Norbert Gyenge
Corporate Information and Computing Services
The University of Sheffield
www.sheffield.ac.uk/cics/research
(Using the Slurm Workload Manager)
Review: Objectives
1. Understand what High Performance Computing is
2. Be able to access remote HPC systems by different methods
3. Run applications on a remote HPC system
4. Manage files using the Linux operating system
5. Know how to use the different kinds of file storage systems
6. Run applications using a scheduling system or workload manager
7. Know how to get more resources and how to get resources dedicated to your research
8. Know how to enhance your research through shell scripting
9. Know how to get help and training
Outline
1. Using the Job Scheduler – Interactive Jobs
2. Batch Jobs
3. Task Arrays
4. Running Parallel Jobs
5. Beyond Bessemer – Accessing Tier 2 Resources
6. Course examples are available via:
git clone --single-branch --branch bessemer https://github.com/rcgsheffield/hpc_intro
1. USING THE JOB SCHEDULER
• Interactive Jobs
• https://docs.hpc.shef.ac.uk/en/latest/bessemer/slurm.html#request-an-interactive-shell
• Batch Jobs
• https://docs.hpc.shef.ac.uk/en/latest/bessemer/slurm.html#submitting-non-interactive-jobs
• SLURM Documentation
• https://slurm.schedmd.com/pdfs/summary.pdf
• https://slurm.schedmd.com/man_index.html
RUNNING JOBS
A NOTE ON INTERACTIVE JOBS
• Software that requires intensive computing should be run on the worker nodes, not the head node.
• Run compute-intensive interactive jobs on the worker nodes using the command:
• srun --pty bash -i
• The maximum (and also the default) time limit for interactive jobs is 8 hours.
SLURM
• Bessemer login nodes are gateways to the cluster of worker nodes.
• The login nodes' main purpose is to allow access to the worker nodes, NOT to run CPU-intensive programs.
• All CPU-intensive computations must be performed on the worker nodes. This is achieved by:
• srun --pty bash -i for interactive jobs
• sbatch submission.sh for batch jobs
• Once you log into Bessemer, take advantage of the power of a worker node for interactive work simply by typing srun --pty bash -i and working in the shell window. The next set of slides assumes that you are already working on one of the worker nodes.
PRACTICE SESSION 1: RUNNING APPLICATIONS ON BESSEMER (PROBLEM 1)
• Case Studies
• Analysis of Patient Inflammation Data
• Running an R application: how to submit jobs and run R interactively
• List available and loaded modules; load the module for the R package
• Start the R application and plot the inflammation data
MANAGING YOUR JOBS
SLURM OVERVIEW
SLURM is the workload management, job scheduling and batch control system. (Others are available, such as PBS, Torque/Maui and Platform LSF.)
• Starts up interactive jobs on available workers
• Schedules all batch-oriented (i.e. non-interactive) jobs
• Fault-tolerant, highly scalable cluster management and job scheduling system
• Optimizes resource utilization
SCHEDULING BATCH JOBS ON THE CLUSTER
[Diagram: a SLURM master node schedules jobs (Job N, O, U, X, Y, Z) from three queues (Queue-A, Queue-B, Queue-C) onto slots on the SLURM worker nodes; scheduling is governed by queues, policies, priorities, shares/tickets, resources and users/projects.]
MANAGING JOBS: MONITORING AND CONTROLLING YOUR JOBS
• There are a number of commands for querying and modifying the status of a running or waiting job. These are:
• squeue (query job status; with no arguments it lists all users' jobs)
• squeue --jobs <jobid>
• squeue --user=<username>
• scancel (delete a job)
• scancel <jobid>
DEMONSTRATION 1
Using the R package to analyse patient data
sbatch example:
sbatch myjob
The first few lines of the submit script myjob contain:
#!/bin/bash
#SBATCH --time=10:00:00
#SBATCH --output=myoutputfile
#SBATCH --error=myerroroutput
and you simply type: sbatch myjob
Running Jobs: batch job example
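The fragment above can be fleshed out into a complete submission script. A minimal sketch follows; the file name myjob is from the slides, but the memory request and echo commands are illustrative additions, not part of the course material:

```shell
#!/bin/bash
# Illustrative Slurm submission script ("myjob"). The #SBATCH lines are
# directives read by the scheduler; when the file is run as an ordinary
# bash script they are treated as comments.
#SBATCH --time=10:00:00          # wall-clock limit of 10 hours
#SBATCH --output=myoutputfile    # file to capture standard output
#SBATCH --error=myerroroutput    # file to capture standard error
#SBATCH --mem=2000               # real memory in MB (the cluster default is 2 GB)

echo "Job started on host: $(hostname)"
# ... the application commands for the job go here ...
echo "Job finished"
```

Submit it with sbatch myjob; Slurm then writes the job's standard output and error streams to the two named files.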
PRACTICE SESSION: SUBMITTING JOBS TO BESSEMER
(PROBLEM 2 & 3)
• Patient Inflammation Study: run the R example as a batch job
• Case Study
• Fish population simulation
• Submitting jobs to SLURM
• Instructions are in the readme file in the slurm folder of the course examples
• From an interactive session
• Load the compiler module
• Compile the fish program
• Run test1, test2 and test3
MANAGING JOBS: REASONS FOR JOB
FAILURES
• SLURM cannot find the binary file specified in the job script
• You ran out of file storage. It is possible to exceed your filestore allocation limits during a job that is producing large output files. Use the quota command to check this.
• Required input files are missing from the startup directory
• Environment variable is not set correctly (LM_LICENSE_FILE etc)
• Hardware failure
FINDING OUT THE MEMORY REQUIREMENTS OF A JOB
• Real Memory Limits:
• The default real memory allocation is 2 GB
• Request 64 GB of memory in a batch file with:
• #SBATCH --mem=64000
• Real memory can also be requested as --mem=<NN>G (e.g. --mem=64G)
Determining the memory requirements of a job:
• scontrol show jobid -dd <jobid>
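As a sketch, the two request forms above can appear in a submission script as follows, with the post-run queries added as comments; the program name ./my_program is a placeholder, and the sacct query is an assumption (an alternative accounting view, not mentioned in the slides):

```shell
#!/bin/bash
# Illustrative memory requests (use one form, not both):
#SBATCH --mem=64000     # 64000 MB of real memory
##SBATCH --mem=64G      # equivalent request in gigabytes (disabled here)

# After the job has run, query its memory use from a login node, e.g.:
#   scontrol show jobid -dd <jobid>                          # detailed job record
#   sacct -j <jobid> --format=JobID,ReqMem,MaxRSS,Elapsed    # accounting view

./my_program    # hypothetical application binary
```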
MANAGING JOBS: RUNNING CPU-PARALLEL JOBS
• Many-processor tasks
• Shared memory
• Distributed memory
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=40
#SBATCH --mem=64000
#SBATCH --mail-user=username@sheffield.ac.uk
module load apps/openmpi/4.0.1/binary
• Jobs are limited to a single node, with a maximum of 40 tasks
• Compilers that support MPI:
• PGI, Intel, GNU
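Putting the directives above together, a complete MPI submission script for Bessemer might look like the following sketch; the mail address is the slides' placeholder, and the launch line (srun ./diffuse, using the course's diffuse binary) is an assumption about how the job body is written:

```shell
#!/bin/bash
# Illustrative MPI batch script for Bessemer (single node, up to 40 tasks).
#SBATCH --nodes=1                              # Bessemer jobs use one node
#SBATCH --ntasks-per-node=40                   # number of MPI tasks on the node
#SBATCH --mem=64000                            # 64 GB of real memory
#SBATCH --mail-user=username@sheffield.ac.uk   # notification address (placeholder)

module load apps/openmpi/4.0.1/binary   # OpenMPI module used in the course

srun ./diffuse    # launch the MPI program built earlier with mpicc
```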
DEMONSTRATION 3
• Test 6 provides an opportunity to practise submitting parallel jobs to the scheduler.
• To run testmpi6, compile the MPI example:
• Load the OpenMPI compiler module
• module load apps/openmpi/4.0.1/binary
• Compile the diffuse program
• mpicc diffuse.c -o diffuse -lm
• Submit the job: sbatch testmpi6
• Use squeue to monitor the job and examine the output
Running a parallel job
MANAGING JOBS: RUNNING ARRAYS OF JOBS
• Many processors running a copy of a task independently
• Add the --array parameter to the script file (with #SBATCH at the beginning of the line)
• Example: #SBATCH --array=1-4:1
• This will create 4 tasks from one job
• Each task will have its environment variable $SLURM_ARRAY_TASK_ID set to a single unique value ranging from 1 to 4.
• There is no guarantee that task number m will start before task number n, where m < n
• https://slurm.schedmd.com/job_array.html
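The behaviour described above can be sketched as a job script. The input file naming (data_<n>.txt) is hypothetical, and the fallback default is an addition that lets the script be exercised outside Slurm:

```shell
#!/bin/bash
# Illustrative task-array script: --array=1-4:1 creates 4 tasks,
# each started with SLURM_ARRAY_TASK_ID set to 1, 2, 3 or 4.
#SBATCH --array=1-4:1

# Fall back to 1 when run outside Slurm, so the script can be tested locally.
TASK_ID=${SLURM_ARRAY_TASK_ID:-1}

# Each task processes its own (hypothetical) input file independently.
echo "Task ${TASK_ID} processing data_${TASK_ID}.txt"
```

Because the tasks are independent, each array element can be scheduled on whichever worker node becomes free first, which is why task m is not guaranteed to start before task n.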
PRACTICE SESSION: SUBMITTING A TASK
ARRAY TO BESSEMER (PROBLEM 4)
• Case Study
• Fish population simulation
• Submitting jobs to Slurm
• Instructions are in the readme file in the Slurm folder of the course examples
• From an interactive session
• Run the Slurm task array example
• Run test4, test5
BEYOND BESSEMER
• Bessemer and ShARC are adequate for many compute problems
• Purchasing dedicated resource
• National Tier 2 facilities for more demanding compute problems
• ARCHER: a larger facility for grand challenge problems (peer review process to access)
https://www.sheffield.ac.uk/cics/research/hpc/costs
HIGH PERFORMANCE
COMPUTING TIERS
• Tier 1 computing
• Archer
• Tier 2 Computing
• Peta-5, JADE
• Tier 3 Computing
• Bessemer, ShARC
PURCHASING RESOURCE
• Buying nodes using framework
• Research groups purchase HPC equipment against their research grants; this hardware is integrated with the Iceberg cluster
• Buying slice of time
• Research groups can purchase servers for a length of time specified by the research group (cost is 1.0p/core per hour)
• Servers are reserved for dedicated usage by the research group using a provided project name
• When reserved nodes are idle they become available to the general short queues. They are quickly released for use by the research group when required.
• For information e-mail research-it@Sheffield.ac.uk
https://www.sheffield.ac.uk/cics/research/hpc/costs
NATIONAL HPC SERVICES
• Tier-2 Facilities
• http://www.hpc-uk.ac.uk/
• https://goo.gl/j7UvBa
• ARCHER: UK National Supercomputing Service
• Hardware: Cray XC30
• 2632 standard nodes
• Each node contains two Intel E5-2697 v2 12-core processors, i.e. 2632 × 2 × 12 = 63,168 cores
• 64 GB of memory per node
• 376 high-memory nodes with 128 GB of memory
• Nodes are connected to each other via the ARIES low-latency interconnect
• Research Data File System: 7.8 PB of disk
• http://www.archer.ac.uk/
• EPCC: HPC facilities
• http://www.epcc.ed.ac.uk/facilities/national-facilities
• Training and expertise in parallel computing
LINKS FOR SOFTWARE
DOWNLOADS
• MobaXterm
https://mobaxterm.mobatek.net/
• Putty
http://www.chiark.greenend.org.uk/~sgtatham/putty/
• WinSCP
http://winscp.net/eng/download.php
• TigerVNC
http://sourceforge.net/projects/tigervnc/