Message-Passing Parallel Programming with MPI
Transcript of Message-Passing Parallel Programming with MPI
What is MPI?
MPI (Message-Passing Interface) is a message-passing library specification that can be used to write parallel programs for parallel computers, clusters, and heterogeneous networks.
Portability across platforms is a crucial advantage of MPI. Not only can MPI programs run on any distributed-memory machine and multicomputer, but they can also run efficiently on shared-memory machines. OpenMP programs can only run efficiently on machines with hardware support for shared memory.
Like OpenMP, MPI was designed with the participation of several computer vendors (IBM, Intel, Cray, Convex, Meiko, Ncube) and software houses (KAI, ParaSoft). Several universities also participated in the design of MPI.
Cooperative operations

Message-passing is an approach that makes the exchange of data cooperative.
Data must both be explicitly sent and received.
An advantage is that any change in the receiver's memory is made with the receiver's participation.

Process 0: send(data)  -->  Process 1: recv(data)

(From W. Gropp's transparencies: Tutorial on MPI. http://www.mcs.anl.gov/mpi ).
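MPI itself is a C/Fortran library, but the cooperative pattern above can be sketched in plain Python, with a queue standing in for the communication channel. The `send`/`recv` helpers here are illustrative stand-ins, not MPI calls:

```python
from queue import Queue

# A queue stands in for the communication channel between two processes.
channel = Queue()

def send(q, data):
    # The sender must explicitly put the data on the channel...
    q.put(data)

def recv(q):
    # ...and the receiver must explicitly take it off: both sides cooperate.
    return q.get()

# "Process 0" sends, "Process 1" receives.
send(channel, {"payload": [1, 2, 3]})
received = recv(channel)
print(received)
```

The receiver's memory changes only because it called `recv`; nothing lands in it behind its back.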
One-sided operations

One-sided operations between parallel processes include remote memory reads and writes.
An advantage is that data can be accessed without waiting for another process.

Process 0: (memory)     Process 1: get(data)
Process 0: put(data)    Process 1: (memory)

(From W. Gropp's transparencies: Tutorial on MPI. http://www.mcs.anl.gov/mpi ).
Comparison

One-sided operations tend to produce programs that are easy to read. With one-sided operations only the processor using the data has to participate in the communication.
• Example: The following code would execute the statement a=f(b,c) in processor 1 of a multicomputer if a and c are in the memory of processor 2.

in processor 1:        in processor 2:
  receive(2,x)           send(1,c)
  y = f(b,x)             receive(1,a)
  send(2,y)

In a shared-memory machine or a machine with a global address space, the code would be simpler because the program running on processor 2 does not need to participate in the computation:

in processor 1:
  x := get(2,c)
  y = f(b,x)
  put(2,a) := y
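As a sketch of the difference, both variants can be imitated in Python, modeling each processor's memory as a dict. The function f and the values of b and c are illustrative choices, and get/put are mimicked by direct dict access:

```python
def f(b, c):
    # Illustrative function; any f(b, c) would do.
    return b + 2 * c

# Each "processor" is modeled as a dict holding its local variables.
proc1 = {"b": 10}
proc2 = {"c": 5}

# --- Message-passing version: BOTH processors run code ---
msg_to_1 = proc2["c"]          # processor 2: send(1, c)
x = msg_to_1                   # processor 1: receive(2, x)
y = f(proc1["b"], x)           # processor 1: y = f(b, x)
msg_to_2 = y                   # processor 1: send(2, y)
proc2["a"] = msg_to_2          # processor 2: receive(1, a)

# --- One-sided version: ONLY processor 1 runs code ---
proc2_onesided = {"c": 5}
x2 = proc2_onesided["c"]                  # x := get(2, c)
proc2_onesided["a"] = f(proc1["b"], x2)   # put(2, a) := y

print(proc2["a"], proc2_onesided["a"])    # both versions agree: 20 20
```

The one-sided version is shorter precisely because processor 2 contributes no code of its own.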
• Example: Executing the loop

  do i=1,n
    a(i) = x(k(i))
  end do

in parallel on a message-passing machine could require complex interactions between processors.
• Pointer chasing across the whole machine is another case where the difficulties of message-passing become apparent.
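A sketch of why a(i) = x(k(i)) is hard: with x block-distributed across processes, each process must first discover, for every i, which remote process owns x(k(i)), so the communication pattern depends on runtime data. The block distribution, process count, and index array below are illustrative assumptions:

```python
# x is block-distributed over nprocs processes; k is a runtime index array.
nprocs = 4
n = 16
x = [i * i for i in range(n)]            # global view of x, for checking
k = [(3 * i + 1) % n for i in range(n)]  # indices known only at run time
block = n // nprocs

def owner(j):
    # Under a block distribution, x(j) lives on process j // block.
    return j // block

# Before any data moves, each element's owner must be computed:
requests = {}  # destination process -> global indices needed from it
for i in range(n):
    requests.setdefault(owner(k[i]), []).append(k[i])

# The resulting pattern is irregular and data-dependent:
print({p: len(idxs) for p, idxs in sorted(requests.items())})

# With a shared address space the whole loop would simply be:
a = [x[k[i]] for i in range(n)]
```

Every change to k changes who must talk to whom, which is exactly the bookkeeping a message-passing program has to do by hand.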
Message Passing
Can be considered a programming model
Typically SPMD (Single Program Multiple Data) rather than MPMD
The program is sequential code in Fortran, C, or C++
All variables are local; there are no shared variables.
Alternatives
High-Performance Fortran
Co-Array Fortran
Unified Parallel C (UPC)
Messages
Message operations are verbose in MPI because of its library implementation (as opposed to a language implementation)
Must specify:
• Which process is sending the message
• Where is the data in the sending process
• What kind of data is being sent
• How much data is being sent
• Which process is going to receive the message
• Where should the data be left in the receiving process
• How much data the receiving process is prepared to accept
Features of MPI

MPI has a number of useful features beyond the send and receive capabilities.

• Communicators. A subset of the active processes that can be treated as a group for collective operations such as broadcast, reduction, barriers, sending or receiving. Within each communicator, a process has a rank that ranges from zero to the size of the group minus one. There is a default communicator that refers to all the MPI processes that is called MPI_COMM_WORLD.

• Topologies. A communicator can have a topology associated with it. This arranges a communicator into some layout. The most common layout is a Cartesian decomposition.

• Communication modes. MPI supports multiple styles of communication, including blocking and non-blocking. Users can also choose to use explicit buffers for sending or allow MPI to manage the buffers. The nonblocking capabilities allow the overlap of communication and computation.

• Single-call collective operations. Some of the calls in MPI automate collective operations in a single call. For example, there is a single call to sum values across all the processes to a single value.

(From K. Dowd and C. Severance. High Performance Computing. O'Reilly 1998).
Writing MPI Programs

      include 'mpif.h'
      integer rank, size, ierr
      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)
      print *, "hello world I'm ", rank, " of ", size
      call MPI_FINALIZE(ierr)
      end

(From W. Gropp's transparencies: Tutorial on MPI. http://www.mcs.anl.gov/mpi ).
Typical output on an SMP:

hello world I'm 1 of 20
hello world I'm 2 of 20
hello world I'm 3 of 20
hello world I'm 4 of 20
hello world I'm 5 of 20
hello world I'm 6 of 20
hello world I'm 7 of 20
hello world I'm 9 of 20
hello world I'm 11 of 20
hello world I'm 12 of 20
hello world I'm 10 of 20
hello world I'm 8 of 20
hello world I'm 14 of 20
hello world I'm 13 of 20
hello world I'm 15 of 20
hello world I'm 16 of 20
hello world I'm 19 of 20
hello world I'm 0 of 20
hello world I'm 18 of 20
hello world I'm 17 of 20
4.03u 0.43s 0:01.88e 237.2%
Commentary
• include ’mpif.h’ provides basic MPI definitions and types
• call MPI_INIT starts MPI
• call MPI_FINALIZE exits MPI
• call MPI_COMM_RANK(MPI_COMM_WORLD,rank) returns the rank of the process making the subroutine call. Notice that this rank is within the default communicator.
• call MPI_COMM_SIZE(MPI_COMM_WORLD,size) returns the total number of processes involved in the execution of the MPI program.
(From W. Gropp’s transparencies: Tutorial on MPI. http://www.mcs.anl.gov/mpi ).
Send and Receive Operations in MPI

The basic (blocking) send operation in MPI is

MPI_SEND(buf, count, datatype, dest, tag, comm)

where

• (buf, count, datatype) describes count occurrences of items of the form datatype starting at buf.
• dest is the rank of the destination in the group associated with the communicator comm.
• tag is an integer to restrict receipt of the message.

(From W. Gropp, E. Lusk, and A. Skjellum. Using MPI. MIT Press 1996).
The receive operation has the form

MPI_RECV(buf, count, datatype, source, tag, comm, status)
where
• count is the size of the receive buffer
• source is the rank of the source process, or MPI_ANY_SOURCE
• tag is a message tag, or MPI_ANY_TAG
• status contains the source, tag, and count of the message actually received.
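The matching rules can be sketched in Python: a receive matches a pending message when both source and tag agree, with the wildcards acting as "match anything". This is a conceptual model of the semantics, not any real MPI binding:

```python
# Wildcard values; the real constants are opaque, -1 is just a placeholder.
MPI_ANY_SOURCE = -1
MPI_ANY_TAG = -1

# Messages that have arrived but not yet been received.
pending = [
    {"source": 0, "tag": 7, "data": "a"},
    {"source": 2, "tag": 3, "data": "b"},
]

def recv(source, tag):
    # Return (and consume) the first pending message matching source and tag;
    # a wildcard argument matches any value.
    for i, msg in enumerate(pending):
        if source in (MPI_ANY_SOURCE, msg["source"]) \
                and tag in (MPI_ANY_TAG, msg["tag"]):
            return pending.pop(i)
    return None

# Wildcard source, specific tag: matches the second message.
status = recv(MPI_ANY_SOURCE, 3)
print(status["source"], status["tag"], status["data"])  # 2 3 b
```

This is why status matters: after a wildcard receive, it is the only way to learn which source and tag actually matched.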
Broadcast and Reduction
The routine MPI_BCAST sends data from one process to all others.
The routine MPI_REDUCE combines data from all processes (by adding them in the example shown next) and returns the result to a single process.
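Their semantics can be sketched in Python, modeling the processes' local values as a list with one entry per rank. These are pure-Python models of what the collectives compute, not MPI calls:

```python
def bcast(values, root):
    # After a broadcast, every process holds the root's value.
    return [values[root]] * len(values)

def reduce_sum(values, root):
    # A sum-reduction delivers the combined result only at the root;
    # the other processes' result buffers are untouched (None here).
    out = [None] * len(values)
    out[root] = sum(values)
    return out

per_process = [10, 20, 30, 40]            # one local value per process
print(bcast(per_process, root=0))         # [10, 10, 10, 10]
print(reduce_sum(per_process, root=0))    # [100, None, None, None]
```

In the pi example that follows, MPI_BCAST plays the first role (distributing n) and MPI_REDUCE with MPI_SUM plays the second (collecting the partial sums at rank 0).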
Second MPI Example: PI

      program main
      include 'mpif.h'

      double precision PI25DT
      parameter (PI25DT = 3.141592653589793238462643d0)

      double precision mypi, pi, h, sum, x, f, a
      integer n, myid, numprocs, i, rc
c     function to integrate
      f(a) = 4.d0 / (1.d0 + a*a)

      call MPI_INIT( ierr )
      call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
      call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
      print *, "Process ", myid, " of ", numprocs, " is alive"

      sizetype = 1
      sumtype  = 2

 10   if ( myid .eq. 0 ) then
         write(6,98)
 98      format('Enter the number of intervals: (0 quits)')
         read(5,99) n
 99      format(i10)
      endif

      call MPI_BCAST(n,1,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)
c     check for quit signal
      if ( n .le. 0 ) goto 30
c     calculate the interval size
      h = 1.0d0/n

      sum = 0.0d0
      do 20 i = myid+1, n, numprocs
         x = h * (dble(i) - 0.5d0)
         sum = sum + f(x)
 20   continue
      mypi = h * sum

c     collect all the partial sums
      call MPI_REDUCE(mypi,pi,1,MPI_DOUBLE_PRECISION,MPI_SUM,0,
     $                MPI_COMM_WORLD,ierr)

c     node 0 prints the answer.
      if (myid .eq. 0) then
         write(6, 97) pi, abs(pi - PI25DT)
 97      format('  pi is approximately: ', F18.16,
     +          '  Error is: ', F18.16)
      endif
      goto 10
 30   call MPI_FINALIZE(rc)
      stop
      end
(From W. Gropp’s transparencies: Tutorial on MPI. http://www.mcs.anl.gov/mpi ).
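The numerical core of the example, the midpoint rule applied to the integral of 4/(1+x^2) over [0,1] (which equals pi), can be checked serially in Python. The strided loop do 20 i = myid+1, n, numprocs simply deals the n midpoint terms out across the processes, so summing the per-process partials reproduces the serial total:

```python
import math

def pi_midpoint(n):
    # Midpoint-rule approximation of the integral of 4/(1+x^2) over [0, 1].
    h = 1.0 / n
    s = sum(4.0 / (1.0 + (h * (i - 0.5)) ** 2) for i in range(1, n + 1))
    return h * s

def pi_strided(n, numprocs):
    # Split the same terms with the Fortran code's stride
    # (i = myid+1, n, numprocs), then sum the partials, which is
    # what MPI_REDUCE with MPI_SUM delivers at rank 0.
    h = 1.0 / n
    partials = []
    for myid in range(numprocs):
        s = sum(4.0 / (1.0 + (h * (i - 0.5)) ** 2)
                for i in range(myid + 1, n + 1, numprocs))
        partials.append(h * s)
    return sum(partials)

print(abs(pi_midpoint(1000) - math.pi))      # midpoint error, roughly 1e-7
print(abs(pi_strided(1000, 4) - math.pi))    # same result from 4 "processes"
```

Up to floating-point rounding from the different summation order, the strided decomposition gives the same answer as the serial loop, which is why the program's accuracy depends only on n, not on numprocs.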