Message-Passing Parallel Programming with MPI
Transcript of Message-Passing Parallel Programming with MPI
What is MPI?
MPI (Message-Passing Interface) is a message-passing library specification that can be used to write parallel programs for parallel computers, clusters, and heterogeneous networks.
Portability across platforms is a crucial advantage of MPI. Not only can MPI programs run on any distributed-memory machine and multicomputer, but they can also run efficiently on shared-memory machines. OpenMP programs can only run efficiently on machines with hardware support for shared memory.
Like OpenMP, MPI was designed with the participation of several computer vendors (IBM, Intel, Cray, Convex, Meiko, Ncube) and software houses (KAI, ParaSoft). Several universities also participated in the design of MPI.
Cooperative operations

Message-passing is an approach that makes the exchange of data cooperative.
Data must both be explicitly sent and received.
An advantage is that any change in the receiver's memory is made with the receiver's participation.

Process 0: send(data)  -->  Process 1: recv(data)

(From W. Gropp's transparencies: Tutorial on MPI. http://www.mcs.anl.gov/mpi ).
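MPI itself is a C/Fortran library, but the cooperative pattern above can be sketched in plain Python, with a queue standing in for the communication channel. The `send`/`recv` helpers here are illustrative stand-ins, not MPI calls:

```python
from queue import Queue

# A queue stands in for the communication channel between two processes.
channel = Queue()

def send(q, data):
    # The sender must explicitly put the data on the channel...
    q.put(data)

def recv(q):
    # ...and the receiver must explicitly take it off: both sides cooperate.
    return q.get()

# "Process 0" sends, "Process 1" receives.
send(channel, {"payload": [1, 2, 3]})
received = recv(channel)
print(received)
```

The receiver's memory changes only because it called `recv`; nothing lands in it behind its back.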
One-sided operations

One-sided operations between parallel processes include remote memory reads and writes.
An advantage is that data can be accessed without waiting for another process.

Process 0: (memory)     Process 1: get(data)
Process 0: put(data)    Process 1: (memory)

(From W. Gropp's transparencies: Tutorial on MPI. http://www.mcs.anl.gov/mpi ).
Comparison

One-sided operations tend to produce programs that are easy to read. With one-sided operations only the processor using the data has to participate in the communication.
• Example: The following code would execute the statement a=f(b,c) in processor 1 of a multicomputer if a and c are in the memory of processor 2.

in processor 1:        in processor 2:
  receive(2,x)           send(1,c)
  y = f(b,x)             receive(1,a)
  send(2,y)

In a shared-memory machine or a machine with a global address space, the code would be simpler because the program running on processor 2 does not need to participate in the computation:

in processor 1:
  x := get(2,c)
  y = f(b,x)
  put(2,a) := y
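As a sketch of the difference, both variants can be imitated in Python, modeling each processor's memory as a dict. The function f and the values of b and c are illustrative choices, and get/put are mimicked by direct dict access:

```python
def f(b, c):
    # Illustrative function; any f(b, c) would do.
    return b + 2 * c

# Each "processor" is modeled as a dict holding its local variables.
proc1 = {"b": 10}
proc2 = {"c": 5}

# --- Message-passing version: BOTH processors run code ---
msg_to_1 = proc2["c"]          # processor 2: send(1, c)
x = msg_to_1                   # processor 1: receive(2, x)
y = f(proc1["b"], x)           # processor 1: y = f(b, x)
msg_to_2 = y                   # processor 1: send(2, y)
proc2["a"] = msg_to_2          # processor 2: receive(1, a)

# --- One-sided version: ONLY processor 1 runs code ---
proc2_onesided = {"c": 5}
x2 = proc2_onesided["c"]                  # x := get(2, c)
proc2_onesided["a"] = f(proc1["b"], x2)   # put(2, a) := y

print(proc2["a"], proc2_onesided["a"])    # both versions agree: 20 20
```

The one-sided version is shorter precisely because processor 2 contributes no code of its own.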
• Example: Executing the loop

  do i=1,n
    a(i) = x(k(i))
  end do

in parallel on a message-passing machine could require complex interactions between processors.
• Pointer chasing across the whole machine is another case where the difficulties of message-passing become apparent.
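A sketch of why a(i) = x(k(i)) is hard: with x block-distributed across processes, each process must first discover, for every i, which remote process owns x(k(i)), so the communication pattern depends on runtime data. The block distribution, process count, and index array below are illustrative assumptions:

```python
# x is block-distributed over nprocs processes; k is a runtime index array.
nprocs = 4
n = 16
x = [i * i for i in range(n)]            # global view of x, for checking
k = [(3 * i + 1) % n for i in range(n)]  # indices known only at run time
block = n // nprocs

def owner(j):
    # Under a block distribution, x(j) lives on process j // block.
    return j // block

# Before any data moves, each element's owner must be computed:
requests = {}  # destination process -> global indices needed from it
for i in range(n):
    requests.setdefault(owner(k[i]), []).append(k[i])

# The resulting pattern is irregular and data-dependent:
print({p: len(idxs) for p, idxs in sorted(requests.items())})

# With a shared address space the whole loop would simply be:
a = [x[k[i]] for i in range(n)]
```

Every change to k changes who must talk to whom, which is exactly the bookkeeping a message-passing program has to do by hand.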
Message Passing
Can be considered a programming model
Typically SPMD (Single Program Multiple Data) rather than MPMD
The program is sequential code in Fortran, C, or C++
All variables are local; there are no shared variables.
Alternatives
High-Performance Fortran
Co-Array Fortran
Unified Parallel C (UPC)
Messages
Message operations are verbose in MPI because of its library implementation (as opposed to a language implementation)
Must specify:
• Which process is sending the message
• Where is the data in the sending process
• What kind of data is being sent
• How much data is being sent
• Which process is going to receive the message
• Where should the data be left in the receiving process
• How much data the receiving process is prepared to accept
Features of MPI

MPI has a number of useful features beyond the send and receive capabilities.

• Communicators. A subset of the active processes that can be treated as a group for collective operations such as broadcast, reduction, barriers, sending or receiving. Within each communicator, a process has a rank that ranges from zero to the size of the group minus one. There is a default communicator that refers to all the MPI processes that is called MPI_COMM_WORLD.

• Topologies. A communicator can have a topology associated with it. This arranges a communicator into some layout. The most common layout is a Cartesian decomposition.

• Communication modes. MPI supports multiple styles of communication, including blocking and non-blocking. Users can also choose to use explicit buffers for sending or allow MPI to manage the buffers. The nonblocking capabilities allow the overlap of communication and computation.

• Single-call collective operations. Some of the calls in MPI automate collective operations in a single call. For example, there is a single call to sum values across all the processes to a single value.

(From K. Dowd and C. Severance. High Performance Computing. O'Reilly 1998).
Writing MPI Programs

      include 'mpif.h'
      integer rank, size, ierr
      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)
      print *, "hello world I'm ", rank, " of ", size
      call MPI_FINALIZE(ierr)
      end

(From W. Gropp's transparencies: Tutorial on MPI. http://www.mcs.anl.gov/mpi ).
Typical output on an SMP:

hello world I'm 1 of 20
hello world I'm 2 of 20
hello world I'm 3 of 20
hello world I'm 4 of 20
hello world I'm 5 of 20
hello world I'm 6 of 20
hello world I'm 7 of 20
hello world I'm 9 of 20
hello world I'm 11 of 20
hello world I'm 12 of 20
hello world I'm 10 of 20
hello world I'm 8 of 20
hello world I'm 14 of 20
hello world I'm 13 of 20
hello world I'm 15 of 20
hello world I'm 16 of 20
hello world I'm 19 of 20
hello world I'm 0 of 20
hello world I'm 18 of 20
hello world I'm 17 of 20
4.03u 0.43s 0:01.88e 237.2%
Commentary
• include ’mpif.h’ provides basic MPI definitions and types
• call MPI_INIT starts MPI
• call MPI_FINALIZE exits MPI
• call MPI_COMM_RANK(MPI_COMM_WORLD,rank) returns the rank of the process making the subroutine call. Notice that this rank is within the default communicator.
• call MPI_COMM_SIZE(MPI_COMM_WORLD,size) returns the total number of processes involved in the execution of the MPI program.
(From W. Gropp’s transparencies: Tutorial on MPI. http://www.mcs.anl.gov/mpi ).
Send and Receive Operations in MPI

The basic (blocking) send operation in MPI is

MPI_SEND(buf, count, datatype, dest, tag, comm)

where

• (buf, count, datatype) describes count occurrences of items of the form datatype starting at buf.
• dest is the rank of the destination in the group associated with the communicator comm.
• tag is an integer to restrict receipt of the message.

(From W. Gropp, E. Lusk, and A. Skjellum. Using MPI. MIT Press 1996).
The receive operation has the form

MPI_RECV(buf, count, datatype, source, tag, comm, status)
where
• count is the size of the receive buffer
• source is the rank of the source process, or MPI_ANY_SOURCE
• tag is a message tag, or MPI_ANY_TAG
• status contains the source, tag, and count of the message actually received.
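The matching rules can be sketched in Python: a receive matches a pending message when both source and tag agree, with the wildcards acting as "match anything". This is a conceptual model of the semantics, not any real MPI binding:

```python
# Wildcard values; the real constants are opaque, -1 is just a placeholder.
MPI_ANY_SOURCE = -1
MPI_ANY_TAG = -1

# Messages that have arrived but not yet been received.
pending = [
    {"source": 0, "tag": 7, "data": "a"},
    {"source": 2, "tag": 3, "data": "b"},
]

def recv(source, tag):
    # Return (and consume) the first pending message matching source and tag;
    # a wildcard argument matches any value.
    for i, msg in enumerate(pending):
        if source in (MPI_ANY_SOURCE, msg["source"]) \
                and tag in (MPI_ANY_TAG, msg["tag"]):
            return pending.pop(i)
    return None

# Wildcard source, specific tag: matches the second message.
status = recv(MPI_ANY_SOURCE, 3)
print(status["source"], status["tag"], status["data"])  # 2 3 b
```

This is why status matters: after a wildcard receive, it is the only way to learn which source and tag actually matched.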
Broadcast and Reduction
The routine MPI_BCAST sends data from one process to all others.
The routine MPI_REDUCE combines data from all processes (by adding them in the example shown next) and returns the result to a single process.
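Their semantics can be sketched in Python, modeling the processes' local values as a list with one entry per rank. These are pure-Python models of what the collectives compute, not MPI calls:

```python
def bcast(values, root):
    # After a broadcast, every process holds the root's value.
    return [values[root]] * len(values)

def reduce_sum(values, root):
    # A sum-reduction delivers the combined result only at the root;
    # the other processes' result buffers are untouched (None here).
    out = [None] * len(values)
    out[root] = sum(values)
    return out

per_process = [10, 20, 30, 40]            # one local value per process
print(bcast(per_process, root=0))         # [10, 10, 10, 10]
print(reduce_sum(per_process, root=0))    # [100, None, None, None]
```

In the pi example that follows, MPI_BCAST plays the first role (distributing n) and MPI_REDUCE with MPI_SUM plays the second (collecting the partial sums at rank 0).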
Second MPI Example: PI

      program main
      include 'mpif.h'

      double precision PI25DT
      parameter (PI25DT = 3.141592653589793238462643d0)

      double precision mypi, pi, h, sum, x, f, a
      integer n, myid, numprocs, i, rc
c     function to integrate
      f(a) = 4.d0 / (1.d0 + a*a)

      call MPI_INIT( ierr )
      call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
      call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
      print *, "Process ", myid, " of ", numprocs, " is alive"

      sizetype = 1
      sumtype  = 2

 10   if ( myid .eq. 0 ) then
         write(6,98)
 98      format('Enter the number of intervals: (0 quits)')
         read(5,99) n
 99      format(i10)
      endif

      call MPI_BCAST(n,1,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)
c     check for quit signal
      if ( n .le. 0 ) goto 30
c     calculate the interval size
      h = 1.0d0/n

      sum = 0.0d0
      do 20 i = myid+1, n, numprocs
         x = h * (dble(i) - 0.5d0)
         sum = sum + f(x)
 20   continue
      mypi = h * sum

c     collect all the partial sums
      call MPI_REDUCE(mypi,pi,1,MPI_DOUBLE_PRECISION,MPI_SUM,0,
     $                MPI_COMM_WORLD,ierr)

c     node 0 prints the answer.
      if (myid .eq. 0) then
         write(6, 97) pi, abs(pi - PI25DT)
 97      format('  pi is approximately: ', F18.16,
     +          '  Error is: ', F18.16)
      endif
      goto 10
 30   call MPI_FINALIZE(rc)
      stop
      end
(From W. Gropp’s transparencies: Tutorial on MPI. http://www.mcs.anl.gov/mpi ).
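The numerical core of the example, the midpoint rule applied to the integral of 4/(1+x^2) over [0,1] (which equals pi), can be checked serially in Python. The strided loop do 20 i = myid+1, n, numprocs simply deals the n midpoint terms out across the processes, so summing the per-process partials reproduces the serial total:

```python
import math

def pi_midpoint(n):
    # Midpoint-rule approximation of the integral of 4/(1+x^2) over [0, 1].
    h = 1.0 / n
    s = sum(4.0 / (1.0 + (h * (i - 0.5)) ** 2) for i in range(1, n + 1))
    return h * s

def pi_strided(n, numprocs):
    # Split the same terms with the Fortran code's stride
    # (i = myid+1, n, numprocs), then sum the partials, which is
    # what MPI_REDUCE with MPI_SUM delivers at rank 0.
    h = 1.0 / n
    partials = []
    for myid in range(numprocs):
        s = sum(4.0 / (1.0 + (h * (i - 0.5)) ** 2)
                for i in range(myid + 1, n + 1, numprocs))
        partials.append(h * s)
    return sum(partials)

print(abs(pi_midpoint(1000) - math.pi))      # midpoint error, roughly 1e-7
print(abs(pi_strided(1000, 4) - math.pi))    # same result from 4 "processes"
```

Up to floating-point rounding from the different summation order, the strided decomposition gives the same answer as the serial loop, which is why the program's accuracy depends only on n, not on numprocs.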