
Transcript of MPI

Page 1: MPI

MPI

Introduction to MPI Commands

Page 2: MPI

Basics – Send and Receive

• MPI is a message passing environment. Processors do NOT share information via shared memory; instead, they share it by sending messages to each other.

• This is done via a send-receive pairing. The originating processor can send anytime it wants to, but the destination processor has to execute a receive before it actually obtains the message.

Page 3: MPI

Send Function - Form

• int MPI_Send(buf, count, datatype, dest, tag, MPI_COMM_WORLD)

• buf – the name of the variable to be sent
• count – how many to send
• datatype – the type of what is being sent
• dest – where to send it
• tag – message type
• MPI_COMM_WORLD – communicator – info about the parallel system

Page 4: MPI

Send Arguments Discussion

• buf – the address of the information to send – can be any data type.

• datatype – must be a data type defined in MPI (ex. MPI_INT, MPI_FLOAT, MPI_DOUBLE, etc.). The user can create data types and “register” them with MPI (later).

• count – how many values of type datatype are to be sent, starting from the address buf (not the byte size of buf)

Page 5: MPI

Send Args Discussion (cont.)

• dest – which process to send the message to. Type: int
• tag – indicator about what kind of message is being sent. Programmer determined. Allows a process to send a variety of types of messages. Type: int
• MPI_COMM_WORLD – communicator – information about the parallel system configuration, used to map the destination (an int) to a particular processor. There will be ways to change and/or create new communicators (later), for example to partition the system into groups of processors doing independent work.
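
As a concrete illustration, a minimal sketch of a blocking send (the function and variable names are made up for the example; the surrounding MPI_Init boilerplate appears on a later slide):

    #include "mpi.h"

    /* Sketch: send one int from this process to process `dest`. */
    void send_value(int value, int dest)
    {
        MPI_Send(&value,          /* buf: address of the data to send    */
                 1,               /* count: one item of type MPI_INT     */
                 MPI_INT,         /* datatype: an MPI-defined type       */
                 dest,            /* dest: rank of the receiving process */
                 0,               /* tag: programmer-chosen message type */
                 MPI_COMM_WORLD); /* communicator                        */
    }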

Page 6: MPI

More Discussion and Notes

• It is more efficient to send a few big blocks of data than to send many small blocks of data (message sending overhead).

• MPI uses MPI-defined data types so that communication between heterogeneous machines is possible.
– Data being sent should be declared with an MPI-defined type

• MPI has MANY constants to indicate certain values (for example, MPI_INT may be 3). Get to know these constants.

Page 7: MPI

Discussion and notes (cont.)

• This send is a blocking send. The next instructions in the program will NOT be executed until the send is done (the data is handed off to the system; it does NOT wait until the data has been received).

Page 8: MPI

Receive

• MPI_Recv(buf, count, datatype, source, tag, MPI_COMM_WORLD, status)
• buf – where to put the message
• count – how many
• datatype – an MPI type for the count items in buf
• source – accept the message from this process (can be a wildcard for any process)
• tag – which type of message to accept (can be a wildcard for any type)
• status – contains the source and tag, for use if the tag and/or source args were wildcards (pass MPI_STATUS_IGNORE if it is not needed)
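
A matching sketch of the receiving side (again with made-up names; the full program skeleton is on the next slide):

    #include "mpi.h"

    /* Sketch: receive one int from process `source`. */
    int recv_value(int source)
    {
        int value;
        MPI_Status status;   /* records the actual source and tag */
        MPI_Recv(&value, 1, MPI_INT, source, 0, MPI_COMM_WORLD, &status);
        return value;
    }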

Page 9: MPI

Minimal MPI

• Each MPI program needs the following 6:
– MPI_Init(&argc, &argv) – initialize MPI; sets up the MPI_COMM_WORLD communicator
– int MPI_Comm_size(MPI_COMM_WORLD, &p) – number of processes into p
– int MPI_Comm_rank(MPI_COMM_WORLD, &rank) – which process am I?
– Send
– Recv
– MPI_Finalize() – terminate MPI
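
Putting the six together, a minimal sketch (the message value and ranks are illustrative; run it with at least two processes):

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int p, rank;
        MPI_Init(&argc, &argv);               /* set up MPI_COMM_WORLD      */
        MPI_Comm_size(MPI_COMM_WORLD, &p);    /* number of processes into p */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* which process am I?        */

        if (rank == 0) {                      /* send */
            int msg = 42;
            MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {               /* recv */
            int msg;
            MPI_Status status;
            MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
            printf("process 1 received %d\n", msg);
        }

        MPI_Finalize();                       /* terminate MPI */
        return 0;
    }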

Page 10: MPI

MPI Philosophy

• One program for all processes
– Starts with init
– Get my process number
– Process 0 is usually the "Master" node (one process to bind them all – apologies to J.R.R. Tolkien)
– Big if/else statement to do master stuff versus slave stuff (see the sketch below)

• Master could also do some slave stuff
– Load balancing issues
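
A sketch of that structure (doMasterWork and doSlaveWork are hypothetical placeholders, not MPI calls):

    if (rank == 0) {
        doMasterWork();   /* coordinate: distribute work, collect results */
        doSlaveWork();    /* optionally, master shares the slave load     */
    } else {
        doSlaveWork();    /* everyone else computes                       */
    }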

Page 11: MPI

C MPI at WU on Herot

• #include "mpi.h"
• int main(int argc, char *argv[])
• MPI_Init(&argc, &argv)
– Typically mpirun's –np # determines the size of COMM_WORLD

• mpicc – to compile MPI programs
• mpirun –np # executable
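
For example, assuming the minimal program above is saved as min.c:

    mpicc -o min min.c      # compile
    mpirun -np 4 ./min      # run with 4 processes in COMM_WORLD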

Page 12: MPI

Bcast

• MPI_Bcast(buf, count, datatype, root, MPI_COMM_WORLD)
– EVERY PROCESS executes this function. It is BOTH a send and a receive.
– Root is the "sender"; all other processes are receivers.
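
A minimal sketch (the value and root 0 are illustrative); note that every rank makes the same call:

    int n = 0;
    if (rank == 0) n = 100;                       /* only the root has the value  */
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD); /* afterwards every rank n==100 */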

Page 13: MPI

Reduce

• MPI_Reduce(sendbuf, recvbuf, count, datatype, op, root, MPI_COMM_WORLD)

• Executed by ALL processes (somewhat of a send and receive).

• EVERYONE sends sendbuf; op is performed across all those items and the answer appears in recvbuf on process root.

• Op is specified by one of many constants (ex. MPI_SUM, MPI_PROD, MPI_MAX, MPI_MIN)
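
A minimal sketch (the per-process values are illustrative):

    int local = rank + 1;   /* each process contributes one value */
    int total = 0;
    MPI_Reduce(&local, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    /* only on rank 0: total == 1 + 2 + ... + p */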

Page 14: MPI

Timing MPI Programs

• double MPI_Wtime()
– Time in seconds since some arbitrary point in time
– Call twice, once at the beginning and once at the end of the code to time
– Difference is elapsed time

• double MPI_Wtick()
– Granularity, in seconds, of the MPI_Wtime function
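
A minimal timing sketch (assumes stdio.h is included, as in the minimal program above):

    double start = MPI_Wtime();
    /* ... code to time ... */
    double elapsed = MPI_Wtime() - start;   /* difference is elapsed time */
    printf("elapsed %f s (clock tick = %g s)\n", elapsed, MPI_Wtick());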

Page 15: MPI

Receive revisited

• Recall
– MPI_Recv(buf, count, datatype, source, tag, MPI_COMM_WORLD, status)
– Source and/or tag can be a wildcard (MPI_ANY_SOURCE, MPI_ANY_TAG)
– Status has type MPI_Status
• status.MPI_SOURCE
• status.MPI_TAG
• status.MPI_ERROR
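
A sketch of a wildcard receive that then inspects the status fields to learn who actually sent the message:

    int value;
    MPI_Status status;
    MPI_Recv(&value, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
             MPI_COMM_WORLD, &status);
    printf("got %d from rank %d with tag %d\n",
           value, status.MPI_SOURCE, status.MPI_TAG);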

Page 16: MPI

Send/Receive Issues – Deadlock

• One necessary condition for deadlock is mutual (cyclic) waiting
– Process 0 does a send to p1 and then a receive from p1
– Process 1 does a send to p0 and then a receive from p0
– If there are no buffers (or the buffers are too small), p0's send will wait until p1 posts its receive, but p1's send is waiting for p0's receive, so p1 never gets to its own receive

Page 17: MPI

More Deadlock

• Doing
– P0 sends to p1 then receives from p1, while p1 receives from p0 then sends to p0 – this will not deadlock.

• Ring solution
– If we have a ring network and we want each processor to send its value to the "next" processor, you might have everyone do a send then a receive – could cause deadlock
– Instead, have even processors do send then receive, and odd processors do receive then send (see the sketch below)
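
A sketch of the even/odd ordering on a ring (rank and p as in the minimal program; assumes p is even so the pairing is clean; MPI_STATUS_IGNORE simply discards the status):

    int next = (rank + 1) % p, prev = (rank + p - 1) % p;
    int out = rank, in;
    if (rank % 2 == 0) {            /* even ranks: send, then receive */
        MPI_Send(&out, 1, MPI_INT, next, 0, MPI_COMM_WORLD);
        MPI_Recv(&in, 1, MPI_INT, prev, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    } else {                        /* odd ranks: receive, then send  */
        MPI_Recv(&in, 1, MPI_INT, prev, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Send(&out, 1, MPI_INT, next, 0, MPI_COMM_WORLD);
    }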

Page 18: MPI

Sendrecv

• MPI_Sendrecv(sendBuf, sendCount, sendType, dest, sendTag, recBuf, recCount, recType, source, recTag, comm, status)

• No need to worry about send/receive order. No deadlock.

• Good when every node gets someone else's data (a data shift)

• If using the same buffer and type for both, can use
– MPI_Sendrecv_replace(buf, count, type, dest, sTag, source, rTag, comm, status)
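
The same ring shift as a single call (a sketch; rank, p, next, and prev as in the ring example above):

    int next = (rank + 1) % p, prev = (rank + p - 1) % p;
    int out = rank, in;
    MPI_Status status;
    MPI_Sendrecv(&out, 1, MPI_INT, next, 0,   /* send my value to next   */
                 &in,  1, MPI_INT, prev, 0,   /* receive prev's value    */
                 MPI_COMM_WORLD, &status);    /* MPI orders the pair     */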

Page 19: MPI

Non Blocking

• MPI_Isend(buf, count, type, dest, tag, comm, request)
• MPI_Irecv(….same…)

• int MPI_Test(request, flag, status)
– Returns flag=1 if the operation associated with request is done, 0 if not
– Status is filled in if flag=1

• MPI_Wait(request, status)
– Blocks until the operation associated with request is done
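
A sketch of overlapping communication with computation (the work in between is illustrative):

    int value, flag;
    MPI_Request request;
    MPI_Status status;
    MPI_Irecv(&value, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &request);
    /* ... do useful work while the message is in flight ... */
    MPI_Test(&request, &flag, &status);   /* flag == 1 if already done */
    if (!flag)
        MPI_Wait(&request, &status);      /* block until it completes  */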