High Performance Computing


Transcript of High Performance Computing

Page 1: High Performance Computing

HIGH PERFORMANCE COMPUTING

MPI and C-Language Seminars 2010

Page 2: High Performance Computing

Seminar Plan

Week 1 – Introduction, Data Types, Control Flow, Pointers
Week 2 – Arrays, Structures, Enums, I/O, Memory
Week 3 – Compiler Options and Debugging
Week 4 – MPI in C and Using the HPSG Cluster
Week 5 – “How to Build a Performance Model”
Weeks 6-9 – Coursework Troubleshooting (seminar tutors available in their offices)

Page 3: High Performance Computing

MPI in C

Page 4: High Performance Computing

Introduction to MPI

MPI (Message Passing Interface) is an extension to C that allows processors to communicate with each other.

No need for a shared memory space – all data passes via messages.

Every processor can send to every other processor, but data must be explicitly received.

Processors are kept synchronised by barriers.

Page 5: High Performance Computing

MPI Hello World (1/2)

The most basic of MPI programs:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);               /* starts MPI */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* get current process id */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* get processor count */

    printf("Hello world from process %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}

Page 6: High Performance Computing

MPI Hello World (2/2)

The MPI environment is established via the MPI_Init call.

MPI_COMM_WORLD – the default communicator, defined as a group of processors.

MPI_Comm_size – gives the number of processors in that communicator; for MPI_COMM_WORLD this represents all the processors.

MPI_Comm_rank – gives the position of the calling processor within the communicator provided.
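As a small illustration (not from the slides), rank and size are commonly used to divide work between processors. This sketch assumes a made-up problem size N and splits the iterations into contiguous blocks:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, i, chunk, start, end;
    const int N = 12;   /* assumed total amount of work, for illustration only */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each process takes a contiguous block of iterations;
       the last rank also picks up any remainder. */
    chunk = N / size;
    start = rank * chunk;
    end = (rank == size - 1) ? N : start + chunk;

    for (i = start; i < end; i++)
        printf("Process %d of %d handles item %d\n", rank, size, i);

    MPI_Finalize();
    return 0;
}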

Page 7: High Performance Computing

Compiling MPI

MPI has different compilers for implementations in different languages; we only need the C one.

mpicc – C based compiler – for us, GCC
mpiCC / mpicxx / mpic++ – C++ based
mpif90 / mpif77 – Fortran based

Compiling is done in the same way as plain C:

mpicc -o myprogram helloworld.c

Page 8: High Performance Computing

Running MPI

Once compiled, an MPI program must be run with mpirun:

mpirun -np 2 myprogram

where 2 is the number of processors to run on.

As there is no synchronisation in the program, the order of the print statements is non-deterministic.

Note: Killing MPI jobs without letting them call MPI_Finalize may result in stray threads.

Page 9: High Performance Computing

Environment Variables

MPI and GCC are installed remotely, and their paths need to be added to your environment variables.

The Module package allows you to quickly load and unload working environments. Module is installed on the cluster (Deep Thought):

‘module avail’ – lists all available modules.
‘module load gnu/openmpi’ – loads gcc-4.3 and openmpi.
‘module list’ – shows currently loaded modules.
‘module unload gnu/openmpi’ – unloads the module.

Page 10: High Performance Computing

Message Passing in MPI

Page 11: High Performance Computing

MPI_Send

MPI_Send – the basic method of passing data. Each MPI_Send must have a matching MPI_Recv.

MPI_Send(message, length, data type, destination, tag, communicator);

Message – the actual data, in the form of a pointer.
Length – the number of elements in the message.
Data Type – the MPI data type of each element in the message.
Destination – the rank of the processor to receive the message.
Tag – an identifier for when sending multiple messages.
Communicator – the processor group (MPI_COMM_WORLD).

Page 12: High Performance Computing

MPI_Recv

Required to match each MPI_Send.

MPI_Recv(message, length, data type, source, tag, communicator, status);

Message – a pointer to the memory address in which to store the data.
Length – the number of elements in the message.
Data Type – the MPI data type of each element in the message.
Source – the rank of the processor sending the message.
Tag – an identifier for when sending multiple messages.
Communicator – the processor group (MPI_COMM_WORLD).
Status – a structure to hold the status of the send/recv.

Page 13: High Performance Computing

Message Passing Example

Processor 0 sends data to processor 1.

int size, rank, tag = 0;
int myarray[3];
MPI_Status status;

MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);

if (rank == 0) {
    myarray[0] = 1; myarray[1] = 2; myarray[2] = 3;
    MPI_Send(myarray, 3, MPI_INT, 1, tag, MPI_COMM_WORLD);
} else {
    MPI_Recv(myarray, 3, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
}
MPI_Finalize();
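As an illustrative extension of the example above (not on the slides), the status structure filled in by MPI_Recv can be inspected on rank 1 using MPI_Get_count and the MPI_SOURCE / MPI_TAG fields. This fragment would sit just before the MPI_Finalize call:

if (rank == 1) {
    int count;
    MPI_Get_count(&status, MPI_INT, &count);   /* number of MPI_INT elements actually received */
    printf("Rank 1 received %d ints from rank %d with tag %d\n",
           count, status.MPI_SOURCE, status.MPI_TAG);
}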

Page 14: High Performance Computing

Process Synchronisation

We need to ensure that all processes are at the same point of execution.

Synchronisation can be implicit or explicit: barriers or blocking communications.

MPI_Barrier(MPI_COMM_WORLD); – waits for all processors before any continue.

MPI_Send / MPI_Recv – wait for the other process to finish receiving before continuing.
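A minimal sketch of an explicit barrier (an illustration, not taken from the slides): every process prints a line, waits at MPI_Barrier, and only then prints a second line.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    printf("Process %d: before the barrier\n", rank);

    /* No process continues past this call until every process has reached it. */
    MPI_Barrier(MPI_COMM_WORLD);

    printf("Process %d: after the barrier\n", rank);

    MPI_Finalize();
    return 0;
}

Note that the barrier only synchronises execution; the interleaving of the printed output is still not guaranteed.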

Page 15: High Performance Computing

Non-Blocking Communication

MPI_Isend / MPI_Irecv instead of MPI_Send / MPI_Recv.

‘I’ stands for immediately – The calling process returns immediately regardless of the status of the actual operation.

MPI_Isend – Allows you to continue processing while the send happens.

MPI_Irecv – You must check the data has arrived before using it.
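A minimal non-blocking sketch (an assumption, not from the slides), to be run on at least two processors: rank 0 posts an MPI_Isend and rank 1 an MPI_Irecv, both may do other work in the meantime, and MPI_Wait is called before the buffer is reused or read.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, data = 0, tag = 0;
    MPI_Request request;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        data = 42;                        /* example value, chosen arbitrarily */
        MPI_Isend(&data, 1, MPI_INT, 1, tag, MPI_COMM_WORLD, &request);
        /* ... other work could be done here while the send is in progress ... */
        MPI_Wait(&request, &status);      /* only reuse 'data' after the wait completes */
    } else if (rank == 1) {
        MPI_Irecv(&data, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &request);
        /* ... other work could be done here while the receive is in progress ... */
        MPI_Wait(&request, &status);      /* 'data' is only valid after the wait completes */
        printf("Process 1 received %d\n", data);
    }

    MPI_Finalize();
    return 0;
}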

Page 16: High Performance Computing

Accessing the Cluster

Page 17: High Performance Computing

Deepthought – IBM Cluster

42 nodes.
2 cores per node (Pentium III – 1.4 GHz).
2 GB RAM per node.
Myrinet fibre-optic interconnect.

ssh [email protected]

scp ./karman.tar.gz [email protected]:/path/

Headnode (Frankie) – not to be used for running jobs. All MPI jobs on Frankie will be killed.

Page 18: High Performance Computing

PBS (1/3)

We use Torque (OpenPBS) and MAUI (scheduler).

Listing jobs in the queue:

fjp@frankie:~$ qstat -a
frankie:
                                                         Req'd   Req'd   Elap
Job ID         Username  Queue  Jobname  SessID  NDS  TSK  Memory  Time   S  Time
-------------  --------  -----  -------  ------  ---  ---  ------  -----  -  -----
27613.frankie  sdh       hpsg   octave   11363    1   --     --    3000:  R  68:44
27614.frankie  sdh       hpsg   octave   11434    1   --     --    3000:  R  68:41

Status flags:
Q – Queued.
R – Running.
E – Ending (staging out of files) – NOT Error!
C – Complete.

Page 19: High Performance Computing

PBS (2/3)

Submitting a job:

From a file:
qsub -V -N <name> -l nodes=x:ppn=y submit.pbs

An interactive job:
qsub -V -N <name> -l nodes=x:ppn=y -I

Submit files:
#!/bin/bash
#PBS -V
cd $PBS_O_WORKDIR
mpirun ./myprog

Deleting a job:
qdel <jobid>

Page 20: High Performance Computing

PBS (3/3)

Node information:

fjp@frankie:~$ pbsnodes -a
vogon0.deepthought.hpsg.dcs.warwick.ac.uk
    state = job-exclusive
    np = 2
    properties = vogon
    ntype = cluster
    jobs = 0/27613.frankie, 1/27614.frankie
    status = .........

Standard output and error:
For interactive jobs, output appears as normal.
Batch jobs:

○ Output file – <jobname / submit file name>.o<jobid>
○ Error file – <jobname / submit file name>.e<jobid>

File I/O takes place as usual. Concurrent writes to the same file can be problematic – avoid them.

Page 21: High Performance Computing

Queues

Different queues are configured with access to different resources and different priorities.

Debug queue – high priority, low core count (~4) – you need to use: qsub -q debug ....

Interactive queue – high priority, medium core count (~8) – no need to specify a queue.

Batch queue – normal priority, high core count (~64).

Page 22: High Performance Computing

Warning

The cluster is a shared resource – don’t leave it until the last minute.

The queue can get very busy.

Don’t leave interactive jobs running when not in use.

Once again – Do not run jobs on Frankie!