Using The Cluster

Transcript of “Using The Cluster”

Page 1

Using The Cluster

Page 2

What We’ll Be Doing

Add users
Run Linpack
Compile code
Compute Node Management

Page 3

Add a User

Page 4

Adding a User Account

Execute:

# useradd <username>

Page 5

Output from ‘useradd’

Creating user: gb
make: Entering directory `/var/411'
/opt/rocks/bin/411put --comment="#" /etc/auto.home
411 Wrote: /etc/411.d/etc.auto..home
Size: 514/207 bytes (encrypted/plain)
Alert: sent on channel 239.2.11.71 with master 10.1.1.1

/opt/rocks/bin/411put --comment="#" /etc/passwd
411 Wrote: /etc/411.d/etc.passwd
Size: 2565/1722 bytes (encrypted/plain)
Alert: sent on channel 239.2.11.71 with master 10.1.1.1

/opt/rocks/bin/411put --comment="#" /etc/shadow
411 Wrote: /etc/411.d/etc.shadow
Size: 1714/1093 bytes (encrypted/plain)
Alert: sent on channel 239.2.11.71 with master 10.1.1.1

/opt/rocks/bin/411put --comment="#" /etc/group
411 Wrote: /etc/411.d/etc.group
Size: 1163/687 bytes (encrypted/plain)
Alert: sent on channel 239.2.11.71 with master 10.1.1.1

make: Leaving directory `/var/411'

Page 6

411 Secure Information Service

Secure NIS replacement

Distributes files within the cluster

The default 411 configuration is to distribute the user account files, but one can use 411 to distribute any file to all nodes
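Pushing a file by hand uses the same command seen in the ‘useradd’ output above (a minimal sketch; ‘/etc/my.conf’ is a hypothetical file used only for illustration):

# /opt/rocks/bin/411put --comment="#" /etc/my.conf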

Page 7

411 Secure Information Service

When a 411 monitored file changes, an alert is multicast

When a node receives an alert, it pulls the file associated with the alert

Compute nodes periodically pull all files under the control of 411
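To force an immediate pull on a node rather than waiting for the periodic one, something like the following should work (a sketch, assuming the standard Rocks 411 client tools; run as root on a compute node):

# /opt/rocks/bin/411get --all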

Page 8

User Accounts

All user accounts are housed on the frontend under: /export/home/<username>

All nodes use ‘autofs’ to automatically mount the user directory when a user logs into a node

This method provides for a simple global file system: on the frontend and every compute node, the user account is available at “/home/<username>”
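Under the hood, this is driven by the /etc/auto.home map that 411 distributes (it appears in the ‘useradd’ output earlier). An entry for the user ‘gb’ created above might look like the sketch below; the NFS source host is an assumption, not copied from a real map:

gb    <frontend>:/export/home/gb

autofs then mounts this on demand at /home/gb whenever ‘gb’ logs into a node.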

Page 9

Deleting a User

Use:

# userdel <username>

Note: the user’s home directory (/export/home/<username>) will not be removed

For safety, this must be removed by hand
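For example (destructive, so double-check the path before running):

# rm -rf /export/home/<username>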

Page 10

Running Linpack

Page 11

Linpack

Linpack is a floating-point benchmark that solves a dense system of linear equations

Measures sustained floating-point operations per second

“Gigaflops”: 1 billion floating point operations per second

This benchmark is used to rate the Top500 fastest supercomputers in the world

We use it as a comprehensive test of the system:
Stresses the CPU
Uses the MPICH layer
Sends a modest number of messages
Ensures a user can launch a job on all nodes
Can run through the queueing system to also test the queueing system

Page 12

Running Linpack From the Command Line

Login as a non-root user:

# su - <userid>

Make a ‘machines’ file. Execute ‘vi machines’ and input the following:

compute-0-0
compute-0-0

Get a test Linpack configuration file:

$ cp /var/www/html/rocks-documentation/3.2.0/examples/HPL.dat .

Page 13

Run It

Load your ssh key into your environment:

$ ssh-agent $SHELL
$ ssh-add

Execute Linpack:

$ /opt/mpich/gnu/bin/mpirun -nolocal -np 2 \
    -machinefile machines /opt/hpl/gnu/bin/xhpl

Flags:
-nolocal : don’t run Linpack on the host that is launching the job
-np 2 : give the job 2 processors
-machinefile : run the job on the nodes specified in the file ‘machines’

Page 14

Successful Linpack Output

The following parameter values will be used:

N      :    2000
NB     :      64
P      :       1
Q      :       2
PFACT  :    Left    Crout    Right
NBMIN  :       8
NDIV   :       2
RFACT  :   Right
BCAST  :  1ringM
DEPTH  :       1
SWAP   : Mix (threshold = 80)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

----------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual checks will be computed:
   1) ||Ax-b||_oo / ( eps * ||A||_1  * N        )
   2) ||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  )
   3) ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0

============================================================================
T/V        N    NB   P   Q        Time       Gflops
----------------------------------------------------------------------------
W11R2L8    2000 64   1   2        1.96    2.724e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) = 0.1049227 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) = 0.0255037 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0055411 ...... PASSED
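As a sanity check on the Gflops figure: HPL performs roughly (2/3)N^3 + 2N^2 floating-point operations, so with N = 2000 that is about 5.34e9 operations, and 5.34e9 / 1.96 s ≈ 2.72e9 flops/s, matching the reported 2.724e+00 Gflops.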

Page 15

Running Linpack Through a Job Management System

Get a test SGE submission script:

$ cp /var/www/html/rocks-documentation/3.2.0/examples/sge-qsub-test.sh .

Examine the script: most of it concerns adding (and removing) a temporary ssh key to your environment

Page 16

Important Part Of The Script

At the top, the requested number of processors:

#$ -pe mpi 2

In the middle, what job to run:

/opt/mpich/gnu/bin/mpirun -nolocal -np $NSLOTS \
    -machinefile $TMPDIR/machines \
    /opt/hpl/gnu/bin/xhpl
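Putting the two pieces together, a complete submission script might look like the following minimal sketch (the layout is an assumption; the shipped ‘sge-qsub-test.sh’ additionally adds and removes the temporary ssh key discussed earlier):

#!/bin/bash
#$ -cwd
#$ -pe mpi 2

# SGE sets $NSLOTS to the number of granted slots and, for the ‘mpi’
# parallel environment, writes the allocated hosts to $TMPDIR/machines
/opt/mpich/gnu/bin/mpirun -nolocal -np $NSLOTS \
    -machinefile $TMPDIR/machines \
    /opt/hpl/gnu/bin/xhpl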

Page 17

Submit the Job

Send the job off to SGE:

$ qsub sge-qsub-test.sh

Page 18

Monitoring the Job

Command line:

$ qstat -f

queuename            qtype used/tot. load_avg arch     states
----------------------------------------------------------------------------
compute-0-0q         BIP   2/2       99.99    glinux
   3 0 sge-qsub-t bruno        r     06/03/2004 02:48:15 MASTER
     0 sge-qsub-t bruno        r     06/03/2004 02:48:15 SLAVE

Page 19

Job Output

SGE writes 4 files:

sge-qsub-test.sh.e0
  Stderr for job ‘0’
sge-qsub-test.sh.o0
  Stdout for job ‘0’
sge-qsub-test.sh.pe0
  Stderr from the queueing system regarding job ‘0’
sge-qsub-test.sh.po0
  Stdout from the queueing system regarding job ‘0’
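To inspect the Linpack results once the job finishes, read the stdout file (assuming the default names above):

$ cat sge-qsub-test.sh.o0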

Page 20

Removing a Job from the Queue

Find the job id with ‘qstat -f’:

queuename            qtype used/tot. load_avg arch     states
----------------------------------------------------------------------------
compute-0-0q         BIP   2/2       99.99    glinux
   3 0 sge-qsub-t bruno        r     06/03/2004 02:48:15 MASTER
     0 sge-qsub-t bruno        r     06/03/2004 02:48:15 SLAVE

Execute:

$ qdel <job id>

To remove the job above:

$ qdel 3

Page 21

Monitoring SGE Via The Web

Setup access to the web server

Local access:
Configure X: redhat-config-xfree86

Remote access:
Open the http port in “/etc/sysconfig/iptables”
Or, use ssh port forwarding:
“ssh <user>@<frontend> -L 8080:localhost:80”
Then point a web browser to “http://localhost:8080”

Page 22

Frontend Web Page

Page 23

SGE Job Monitoring

Page 24

SGE Job Monitoring

Page 25

Ganglia Monitoring

Page 26

Ganglia Monitoring

Page 27

Scaling Up Linpack

Tell SGE to allocate more processors. Edit ‘sge-qsub-test.sh’ and change:

#$ -pe mpi 2

To:

#$ -pe mpi 4

Tell Linpack to use more processors. Edit ‘HPL.dat’ and change:

1 Ps

To:

2 Ps

The number of processors Linpack uses is P * Q
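With ‘#$ -pe mpi 4’ in the submission script and P = 2, Q = 2 in ‘HPL.dat’, Linpack’s P * Q = 4 matches the 4 slots SGE grants (the $NSLOTS passed to mpirun).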

Page 28

Scaling Up Linpack

Submit the larger job:

$ qsub sge-qsub-test.sh

To make Linpack use more memory (and increase performance), edit ‘HPL.dat’ and change:

1000 Ns

To:

4000 Ns

Linpack operates on an N * N matrix

Goal: consume 75% of memory on each compute node
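For example (assuming two compute nodes with 1 GB of RAM each): the N * N matrix of double-precision values needs N * N * 8 bytes, so N ≈ sqrt(0.75 * 2 * 2^30 / 8) ≈ 14,000 would hit the 75% target across the two nodes.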

Page 29

Using Linpack Over Myrinet

Get a test Myrinet SGE submission script:

$ cp /var/www/html/rocks-documentation/3.2.0/examples/sge-qsub-test-myri.sh .

Submit the Myrinet-based job:

$ qsub sge-qsub-test-myri.sh

Scale up the job in the same manner as described in the previous slides.

Page 30

Executing Commands Across the Cluster

Collect “ps” status:
cluster-ps <regular expression>
To get the status of all the processes being executed by user ‘bruno’, execute:
cluster-ps bruno

Kill processes:
cluster-kill <regular expression>
To kill all the Linpack jobs, execute:
cluster-kill xhpl

Execute any command line executable:
cluster-fork <command>
To restart the ‘autofs’ service on all compute nodes, execute:
cluster-fork “service autofs restart”

Page 31

Executing Commands Across the Cluster

All cluster-* commands can query the database to generate a node list

To restart the ‘autofs’ service only on the nodes in cabinet 1, execute:

cluster-fork --query=“select name from nodes where rack=1” “service autofs restart”

Page 32

Compile Code


Page 34

Compile Test MPI Program with gcc

Compile cpi:

$ cp /opt/mpich/gnu/examples/cpi.c $HOME
$ cp /opt/mpich/gnu/examples/Makefile $HOME
$ make cpi
/opt/mpich/gnu/bin/mpicc -c cpi.c
/opt/mpich/gnu/bin/mpicc -o cpi cpi.o -lm

Run it:

$ /opt/mpich/gnu/bin/mpirun -nolocal -np 2 -machinefile machines $HOME/cpi
Process 0 on compute-2-1.local
pi is approximately 3.1416009869231241, Error is 0.0000083333333309
wall clock time = 0.000650
Process 1 on compute-2-1.local

Page 35

Compile MPI Code with Intel Compiler

Simply change ‘gnu’ to ‘intel’

$ cp /opt/mpich/intel/examples/cpi.c $HOME
$ cp /opt/mpich/intel/examples/Makefile $HOME
$ make cpi
/opt/mpich/intel/bin/mpicc -c cpi.c
/opt/mpich/intel/bin/mpicc -o cpi cpi.o -lm

Page 36

Bring In Your Own Code

FTP your code to the frontend

Let’s compile and try to run it!
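For a typical MPI program, the same mpicc/mpirun pattern shown for ‘cpi’ applies (a sketch; ‘myapp.c’ is a hypothetical source file, not something shipped with the cluster):

$ /opt/mpich/gnu/bin/mpicc -o myapp myapp.c
$ /opt/mpich/gnu/bin/mpirun -nolocal -np 2 -machinefile machines $HOME/myapp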

Page 37

Compute Node Management

Page 38

Adding a Compute Node

Execute “insert-ethers”

If adding to a specific rack, for example, cabinet 2:
“insert-ethers --cabinet=2”

If adding to a specific location within a rack:
“insert-ethers --cabinet=2 --rank=4”

Page 39

Replacing a Dead Node

To replace node compute-0-4:

# insert-ethers --replace="compute-0-4"

Remove the dead node
Power up the new node
Put the new node into “installation mode”:
Boot with the Rocks Base CD, PXE boot, etc.

The next node that issues a DHCP request will assume the role of compute-0-4

Page 40

Removing a Node

If decommissioning a node:

# insert-ethers --remove="compute-0-2"

Insert-ethers will remove all traces of compute-0-2 from the database and restart all relevant services

You will not be asked for any input