
Transcript of Introduction to HPC2N


Introduction to HPC2N

Birgitte Brydsø

HPC2N, Umeå University

3 December 2019

1 / 23


Kebnekaise

1. 602 nodes / 19288 cores (of which 2448 are KNL)
   432 Intel Xeon E5-2690v4, 2x14 cores, 128 GB/node
   52 Intel Xeon Gold 6132, 2x14 cores, 192 GB/node
   20 Intel Xeon E7-8860v4, 4x18 cores, 3072 GB/node
   32 Intel Xeon E5-2690v4, 2x NVidia K80, 2x14, 2x4992, 128 GB/node
   4 Intel Xeon E5-2690v4, 4x NVidia K80, 2x14, 4x4992, 128 GB/node
   10 Intel Xeon Gold 6132, 2x NVidia V100, 2x14, 2x5120, 192 GB/node
   36 Intel Xeon Phi 7250, 68 cores, 192 GB/node, 16 GB MCDRAM/node

2. 501760 CUDA "cores" (80*4992 cores/K80 + 20*5120 cores/V100)

3. More than 136 TB memory

4. Interconnect: Mellanox FDR / EDR Infiniband

5. Theoretical performance: 728 TF (+ expansion)

6. Date installed: Fall 2016 / Spring 2017 / Spring 2018

2 / 23


Using Kebnekaise: Connecting to HPC2N's systems

Linux, Windows, MacOS/OS X: install the ThinLinc client

Linux, OS X:
ssh <username>@kebnekaise.hpc2n.umu.se

Use ssh -Y .... if you want to open graphical displays.

Windows:
Get an SSH client (MobaXterm, PuTTY, Cygwin, ...)
Get an X11 server if you need graphical displays (Xming, ...)
Start the client and log in with your HPC2N username to

kebnekaise.hpc2n.umu.se

More information here:

https://www.hpc2n.umu.se/documentation/guides/windows-connection

Mac/OS X: Guide here: https://www.hpc2n.umu.se/documentation/guides/mac-connection

3 / 23


Using Kebnekaise: Connecting with ThinLinc

Download and install the client from https://www.cendio.com/thinlinc/download

Start the client. Enter the name of the server: kebnekaise-tl.hpc2n.umu.se, and then enter your own username under "Username". Enter your password.

Go to "Options" -> "Security" and check that the authentication method is set to password.

Go to "Options" -> "Screen" and uncheck "Full screen mode".

Click "Connect". Click "Continue" when you are being told that the server's host key is not in the registry.

After a short time, the ThinLinc desktop opens, running Mate, which is fairly similar to the Gnome desktop. All your files on HPC2N should be available.

4 / 23


Using Kebnekaise: Transfer your files and data

Linux, OS X:
Use scp (or sftp) for file transfer. Example, scp:

local> scp <username>@kebnekaise.hpc2n.umu.se:file .

local> scp file <username>@kebnekaise.hpc2n.umu.se:file

Windows:
Download a client: WinSCP, FileZilla (sftp), PSCP/PSFTP, ...
Transfer with sftp or scp

Mac/OS X:
Transfer with sftp or scp (as for Linux) using Terminal
Or download a client: Cyberduck, Fetch, ...

More information in the guides (see previous slide) and here: https://www.hpc2n.umu.se/documentation/filesystems/filetransfer
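A minimal sketch of an interactive sftp session (the file names here are placeholders, not part of the original slides):

local> sftp <username>@kebnekaise.hpc2n.umu.se
sftp> put mydata.tar.gz       # upload a local file to Kebnekaise
sftp> get results.txt         # download a remote file to the current local directory
sftp> exit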

5 / 23


Using Kebnekaise: Editors

Editing your files

Various editors: vi, vim, nano, emacs ...

Example, vi/vim:

vi <filename>
Insert before: i
Save and exit vi/vim: Esc :wq

Example, nano:

nano <filename>
Save and exit nano: Ctrl-x

Example, Emacs:

Start with: emacs
Open (or create) file: Ctrl-x Ctrl-f
Save: Ctrl-x Ctrl-s
Exit Emacs: Ctrl-x Ctrl-c

6 / 23


The File System

AFS
Your home directory is here ($HOME)
Regularly backed up
NOT accessible by the batch system (ticket forwarding doesn't work)

Secure authentication with Kerberos tickets

PFS
Parallel File System
NO BACKUP
High performance when accessed from the nodes
Accessible by the batch system
Create a symbolic link from $HOME to pfs:

ln -s /pfs/nobackup/$HOME $HOME/pfs

7 / 23


The Module System (Lmod)

Most programs are accessed by first loading them as a ’module’

Modules are:

used to set up your environment (paths to executables, libraries, etc.) for using a particular (set of) software package(s)

a tool to help users manage their Unix/Linux shell environment, allowing groups of related environment-variable settings to be made or removed dynamically

a way to have multiple versions of a program or package available, by just loading the proper module

installed in a hierarchical layout. This means that some modules are only available after loading a specific compiler and/or MPI version.

8 / 23


The Module System (Lmod)

Most programs are accessed by first loading their ’module’

See which modules exist: module spider or ml spider

See which modules can be loaded right now, depending on what is currently loaded: module avail or ml av

See which modules are currently loaded: module list or ml

Example: loading a compiler toolchain and version, here for GCC, OpenMPI, OpenBLAS/LAPACK, FFTW, ScaLAPACK and CUDA: module load fosscuda/2019a or ml fosscuda/2019a

Example: unload the above module: module unload fosscuda/2019a or ml -fosscuda/2019a

More information about a module: module show <module> or ml show <module>

Unload all modules except the ’sticky’ modules:

module purge or ml purge
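A minimal sketch of how this looks in practice for a hierarchical module (using the R module and its prerequisites from a later slide as the example):

ml spider R                          # list all R versions and what they require
ml GCC/6.4.0-2.28 OpenMPI/2.1.2      # load the prerequisites (compiler + MPI) first
ml av                                # the matching R versions are now visible
ml R/3.4.4-X11-20180131              # load R itself
ml                                   # list what is currently loaded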

9 / 23


The Module System: Compiler Toolchains

Compiler toolchains load bundles of software making up a complete environment for compiling/using a specific prebuilt software. Includes some/all of: compiler suite, MPI, BLAS, LAPACK, ScaLAPACK, FFTW, CUDA.

Some of the currently available toolchains (check ml av for all/versions):

GCC: GCC only
gcccuda: GCC and CUDA
foss: GCC, OpenMPI, OpenBLAS/LAPACK, FFTW, ScaLAPACK
fosscuda: GCC, OpenMPI, OpenBLAS/LAPACK, FFTW, ScaLAPACK, and CUDA
gimkl: GCC, IntelMPI, IntelMKL
gimpi: GCC, IntelMPI
gompi: GCC, OpenMPI
gompic: GCC, OpenMPI, CUDA
goolfc: gompic, OpenBLAS/LAPACK, FFTW, ScaLAPACK
icc: Intel C and C++ only
iccifort: icc, ifort
iccifortcuda: icc, ifort, CUDA
ifort: Intel Fortran compiler only
iimpi: icc, ifort, IntelMPI
intel: icc, ifort, IntelMPI, IntelMKL
intelcuda: intel and CUDA
iomkl: icc, ifort, Intel MKL, OpenMPI
pomkl: PGI C, C++, and Fortran compilers, IntelMPI
pompi: PGI C, C++, and Fortran compilers, OpenMPI
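A brief sketch of using one of these toolchains to compile, assuming a foss version such as foss/2019a is installed (check ml av); the source file names are placeholders, and mpicc/gfortran are the compilers provided by the GCC/OpenMPI-based toolchains:

ml purge                                                   # start from a clean environment
ml foss/2019a                                              # GCC, OpenMPI, OpenBLAS/LAPACK, FFTW, ScaLAPACK
gfortran -O2 -o my_serial_program my_serial_program.f90    # serial Fortran build
mpicc -O2 -o my_mpi_program my_mpi_program.c               # MPI C build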

10 / 23


Compiling and Linking with Libraries: Linking

Figuring out how to link

Intel and Intel MKL linking: https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor

Buildenv

After loading a compiler toolchain, load 'buildenv' and use 'ml show buildenv' to get useful linking info.
Example, fosscuda, version 2019a:
ml fosscuda/2019a

ml buildenv

ml show buildenv

Using the environment variables (prefaced with $) is highly recommended!
You have to load the buildenv module in order to be able to use the environment variables for linking!
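A minimal sketch of what that can look like, assuming 'ml show buildenv' lists linking variables such as $LIBBLAS and $LIBLAPACK (which is exactly what you check with the command); the source file is a placeholder:

ml fosscuda/2019a
ml buildenv
ml show buildenv                                                     # note the linking variables it defines
gcc -O2 -o my_blas_program my_blas_program.c $LIBBLAS $LIBLAPACK     # link against the toolchain's BLAS/LAPACK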

11 / 23


The Batch System (SLURM)

Large/long/parallel jobs must be run through the batch system

SLURM is an Open Source job scheduler, which provides three key functions:

Keeps track of available system resources
Enforces local system resource usage and job scheduling policies
Manages a job queue, distributing work across resources according to policies

In order to run a batch job, you need to create and submit a SLURM submit file (also called a batch submit file, a batch script, or a job script).

Guides and documentation at: http://www.hpc2n.umu.se/support

12 / 23


The Batch System (SLURM): Useful Commands

Submit job: sbatch <jobscript>

Get list of your jobs: squeue -u <username>

Run (parallel) tasks within a job: srun <commands for your job/program>

Request an interactive job allocation: salloc <commands to the batch system>

Check on a specific job: scontrol show job <job id>

Delete a specific job: scancel <job id>

Useful info about job: sacct -l -j <jobid> | less -S
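A minimal sketch of how these commands fit together (the job script and program names are placeholders; the project ID is the course project from the last slide):

sbatch jobscript.sh                              # submit; prints the assigned <job id>
squeue -u $USER                                  # check your jobs in the queue
scontrol show job <job id>                       # details for one specific job
scancel <job id>                                 # delete it if needed
sacct -l -j <job id> | less -S                   # accounting info once it has finished
salloc -A SNIC2019-5-156 -n 4 --time=00:30:00    # interactive allocation, then e.g.:
srun ./my_mpi_program                            # run inside the allocation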

13 / 23


The Batch System (SLURM): Job Output

Output and errors in: slurm-<job-id>.out

To get output and error files split up, you can give these flags in the submit script:
#SBATCH --error=job.%J.err

#SBATCH --output=job.%J.out

To specify Broadwell or Skylake only:
#SBATCH --constraint=broadwell or #SBATCH --constraint=skylake

To run on the GPU nodes, add this to your script:
#SBATCH --gres=gpu:<card>:x
where <card> is k80 or v100 and x = 1, 2, or 4 (4 only if K80).

http://www.hpc2n.umu.se/resources/hardware/kebnekaise

14 / 23


The Batch System (SLURM): Simple example, serial

Example: Serial job, compiler toolchain ’fosscuda/2019a’

#!/bin/bash

# Project id - change to your own after the course!

#SBATCH -A SNIC2019-5-156

# Asking for 1 core

#SBATCH -n 1

# Asking for a walltime of 5 min

#SBATCH --time=00:05:00

# Always purge modules before loading new in a script.

ml purge > /dev/null 2>&1
ml fosscuda/2019a

./my_serial_program

Submit with:

sbatch <jobscript>

15 / 23


The Batch System (SLURM): Parallel example

#!/bin/bash

#SBATCH -A SNIC2019-5-156

#SBATCH -n 14

#SBATCH --time=00:05:00

ml purge > /dev/null 2>&1
ml fosscuda/2019a

srun ./my_mpi_program

16 / 23


The Batch System (SLURM): Requesting GPU nodes

Currently there is no separate queue for the GPU nodes

Request GPU nodes by adding this to your batch script:

#SBATCH --gres=gpu:<type-of-card>:x

where <type-of-card> is either k80 or v100 and x = 1, 2, or 4 (4 only for the K80 type)

There are 32 nodes (Broadwell) with dual K80 cards and 4 nodes with quad K80 cards

There are 10 nodes (Skylake) with dual V100 cards
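A minimal sketch of a GPU submit script combining the flags above (the program name is a placeholder; the project ID is the course project from the last slide):

#!/bin/bash
#SBATCH -A SNIC2019-5-156
#SBATCH -n 1
#SBATCH --time=00:10:00
#SBATCH --gres=gpu:v100:2        # two V100 cards on a Skylake GPU node

ml purge > /dev/null 2>&1
ml fosscuda/2019a

./my_gpu_program                 # placeholder for your GPU-enabled program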

17 / 23


R at HPC2N: Loading R

Check which version of R is installed: ml spider R

Choose the version you want. We recommend R/3.4.4-X11-20180131

Load the necessary prerequisites as well as the module:
ml GCC/6.4.0-2.28 OpenMPI/2.1.2 R/3.4.4-X11-20180131

You can now run R, or install any R packages you wish.

On our website you can see how to find out which R packages are already installed:
https://www.hpc2n.umu.se/resources/software/r (section "HPC2N R addons")
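A minimal sketch of running an R script through the batch system with this module (the script name is a placeholder):

#!/bin/bash
#SBATCH -A SNIC2019-5-156
#SBATCH -n 1
#SBATCH --time=00:10:00

ml purge > /dev/null 2>&1
ml GCC/6.4.0-2.28 OpenMPI/2.1.2 R/3.4.4-X11-20180131

Rscript my_analysis.R            # placeholder R script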

18 / 23


R at HPC2N: Installing R packages/add-ons

Create a place for the R add-ons and tell R to find it. Here we use /pfs/nobackup$HOME/R-packages:
mkdir -p /pfs/nobackup$HOME/R-packages

R reads the $HOME/.Renviron file to set up its environment. Since you want to use R from the batch system, we need to create a link in $HOME to the .Renviron file in pfs:
ln -s /pfs/nobackup$HOME/.Renviron $HOME

Since the file is likely empty now, tell R where your add-on directory is, like this:
echo R_LIBS="/pfs/nobackup$HOME/R-packages" > ~/.Renviron

If it is not empty, edit $HOME/.Renviron so that R_LIBS contains the path to your chosen add-on directory. It should look something like this when you are done:
R_LIBS="/pfs/nobackup/home/u/user/R-packages"
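Taken together, a minimal sketch of the whole setup, assuming $HOME/.Renviron does not already exist:

mkdir -p /pfs/nobackup$HOME/R-packages                                         # add-on directory on pfs
echo R_LIBS="/pfs/nobackup$HOME/R-packages" > /pfs/nobackup$HOME/.Renviron     # write the file on pfs
ln -s /pfs/nobackup$HOME/.Renviron $HOME/.Renviron                             # link it into $HOME so R finds it
cat $HOME/.Renviron                                                            # verify the R_LIBS line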

19 / 23


R at HPC2N: Installing R packages/add-ons automatically

Automatic download and install

Load R and dependencies
Install from a CRAN repo (in Sweden)

R --quiet --no-save --no-restore -e "install.packages('package', repos='http://ftp.acc.umu.se/mirror/CRAN/')"

If the package has dependencies that come from more than one repo, it will not work. In that case, either run "install.packages" interactively in R or use the manual method.

You can now use your add-on like this:
library("package")

20 / 23


R at HPC2N: Installing R packages/add-ons manually

Manual download and install

Download the add-on (with wget, for instance) from the CRAN package site. Download and install any prerequisites first
Load R and dependencies
Tell R to install into your chosen add-on directory:

R CMD INSTALL -l /pfs/nobackup$HOME/R-packages R-package.tar.gz

You can now use your add-on like this:
library("package")

21 / 23


R at HPC2N: RStudio

RStudio is only installed on the ThinLinc node, so you need to connect to that first with your ThinLinc client

Start RStudio with rstudio

Note that you cannot submit jobs to the batch system from inside RStudio! Anything run from inside it will run directly on the ThinLinc node!

22 / 23


Various useful info

A project has been set up for the workshop: SNIC2019-5-156

You use it in your batch submit file by adding:

#SBATCH -A SNIC2019-5-156

There is a reservation for 2 regular Broadwell nodes. This reservation is accessed by adding this to your batch submit file:

#SBATCH --reservation=ml-with-r

The reservation is ONLY valid for the duration of the course.
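So, during the course, the top of a submit script would look something like this (the resource and time lines are only an example):

#!/bin/bash
#SBATCH -A SNIC2019-5-156              # course project
#SBATCH --reservation=ml-with-r        # course reservation (only valid during the course)
#SBATCH -n 1
#SBATCH --time=00:05:00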

23 / 23