Page 1:

Introduction to MPI
MPI programming
Running an MPI program
Architecture of MPICH

Lecture 2, Part II: Message Passing Programming: MPI

Page 2:

Message Passing Interface (MPI)

Page 3:

What is MPI?

A message passing library specification
– message-passing model
– not a compiler specification
– not a specific product

For parallel computers, clusters and heterogeneous networks.

Full-featured

Page 4:

Why use MPI? (1)

Message passing is now mature as a programming paradigm
– well understood
– efficient match to hardware
– many applications

Page 5:

Why use MPI? (2)

Full range of desired features
– modularity
– access to peak performance
– portability
– heterogeneity
– subgroups
– topologies
– performance measurement tools

Page 6:

Who Designed MPI?

Vendors
– IBM, Intel, TMC, SGI, Meiko, Cray, Convex, Ncube, …

Library writers
– PVM, p4, Zipcode, TCGMSG, Chameleon, Express, Linda, DP (HKU), PM (Japan), AM (Berkeley), FM (HPVM at Illinois)

Application specialists and consultants

Page 7:


Vendor-Supported MPI

HP-MPI       Hewlett-Packard; Convex SPP
MPI-F        IBM SP1/SP2
Hitachi/MPI  Hitachi
SGI/MPI      SGI PowerChallenge series
MPI/DE       NEC
INTEL/MPI    Intel Paragon (iCC lib)
T.MPI        Telmat Multinode
Fujitsu/MPI  Fujitsu AP1000
EPCC/MPI     Cray & EPCC, T3D/T3E

Page 8:


Public-Domain MPI

MPICH     Argonne National Lab. & Mississippi State Univ.
LAM       Ohio Supercomputer Center
MPICH/NT  Mississippi State University
MPI-FM    Illinois (Myrinet)
MPI-AM    UC Berkeley (Myrinet)
MPI-PM    RWCP, Japan (Myrinet)
MPI-CCL   California Institute of Technology

Page 9:

Public-Domain MPI

CRI/EPCC MPI  Cray Research and Edinburgh Parallel Computing Centre (Cray T3D/E)
MPI-AP        Australian National University, CAP Research Program (AP1000)
W32MPI        Illinois, Concurrent Systems
RACE-MPI      Hughes Aircraft Co.
MPI-BIP       INRIA, France (Myrinet)

Page 10:

Communicator Concept in MPI

A communicator identifies the process group and context with respect to which the operation is to be performed.

Page 11:

Communicator (2)

[Figure: four communicators, each containing several processes. Processes in different communicators cannot communicate. A communicator can exist within another communicator, and the same process can exist in different communicators.]

Page 12:

Features of MPI (1)

General
– Communicators combine context and group for message security

Page 13:

Features of MPI (2)

Point-to-point communication
– Structured buffers and derived data types; heterogeneity
– Modes: normal (blocking and non-blocking), synchronous, ready (to allow access to fast protocols), buffered

Page 14:

Features of MPI (3)

Collective communication
– Both built-in and user-defined collective operations
– Large number of data movement routines
– Subgroups defined directly or by topology
– E.g., broadcast, barrier, reduce, scatter, gather, all-to-all, …

Page 15:

MPI Programming

Page 16:

Writing MPI programs

MPI comprises 125 functions.

Many parallel programs can be written with just 6 basic functions.

Page 17:

Six basic functions (1)

MPI_INIT: Initiate an MPI computation.

MPI_FINALIZE: Terminate a computation.

Page 18:

Six basic functions (2)

MPI_COMM_SIZE: Determine the number of processes in a communicator.

MPI_COMM_RANK: Determine the identifier (rank) of a process in a specific communicator.

Page 19:

Six basic functions (3)

MPI_SEND: Send a message from one process to another process.

MPI_RECV: Receive a message sent by another process.

Page 20:

Program main

begin

MPI_INIT()

MPI_COMM_SIZE(MPI_COMM_WORLD, count)

MPI_COMM_RANK(MPI_COMM_WORLD, myid)

print(“I am ”, myid, “ of ”, count)

MPI_FINALIZE()

end

A simple program

MPI_INIT()

Initiate computation

MPI_COMM_SIZE(MPI_COMM_WORLD, count)

Find the number of processes

MPI_COMM_RANK(MPI_COMM_WORLD, myid)

Find the process ID of the current process

print(“I am “, myid, “ of “, count)

Each process prints out its output

MPI_FINALIZE()

Shut down
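For reference, here is a minimal C rendering of the slide's pseudocode (an illustrative sketch using the standard MPI C bindings; the exact print format is an assumption):

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int count, myid;
        MPI_Init(&argc, &argv);                 /* initiate the computation */
        MPI_Comm_size(MPI_COMM_WORLD, &count);  /* number of processes */
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);   /* rank of this process */
        printf("I am %d of %d\n", myid, count); /* each process prints its line */
        MPI_Finalize();                         /* shut down */
        return 0;
    }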

Page 21:

Result

I'm 3 of 4
I'm 0 of 4
I'm 1 of 4
I'm 2 of 4

(Processes 0–3 each print one line; the output order is arbitrary.)

Page 22:

Point-to-Point Communication

The basic point-to-point communication operators are send and receive.

[Figure: sender's buffer → transmission → receiver's buffer.]

Page 23:

Another simple program (2 nodes)

…
MPI_COMM_RANK(MPI_COMM_WORLD, myid)
if myid = 0
    MPI_SEND("Zero", …, …, 1, …, …)
    MPI_RECV(words, …, …, 1, …, …, …)
else
    MPI_RECV(words, …, …, 0, …, …, …)
    MPI_SEND("One", …, …, 0, …, …)
END IF
print("Received from ", words)
…

I'm process 0!
    MPI_SEND("Zero", …, …, 1, …, …)
    MPI_RECV(words, …, …, 1, …, …, …)

I'm process 1!
    MPI_RECV(words, …, …, 0, …, …, …)
    MPI_SEND("One", …, …, 0, …, …)
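A possible C realization of this two-process exchange (a sketch only: the slide elides the count, tag, and status arguments with "…", so the values below are illustrative; run it with exactly two processes):

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int myid;
        char words[16];
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        if (myid == 0) {
            /* process 0 sends first, then receives */
            MPI_Send("Zero", 5, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(words, 16, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
        } else {
            /* process 1 receives first, then sends */
            MPI_Recv(words, 16, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
            MPI_Send("One", 4, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
        printf("Received from %s\n", words);
        MPI_Finalize();
        return 0;
    }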

Page 24:

[Figure: the exchange step by step.
Process 0: MPI_SEND("Zero", …, 1, …) sends "Zero" to process 1, then MPI_RECV(words, …, 1, …) waits; when "One" arrives in the words buffer, it prints "Received from One".
Process 1: MPI_RECV(words, …, 0, …) sets up a buffer and waits for the message from process 0; once "Zero" is received, MPI_SEND("One", …, 0, …) sends the reply, and it prints "Received from Zero".]

Page 25:

Result

Process 0: Received from One

Process 1: Received from Zero

Page 26:

Collective Communication (1)

Communication that involves a group of processes

[Figure: one sender's buffer is transmitted to the buffers of several receivers.]

Page 27:

Collective Communication (2)

Three types:

Barrier
• MPI_BARRIER

Data movement
• MPI_BCAST
• MPI_GATHER
• MPI_SCATTER

Reduction operations
• MPI_REDUCE

Page 28:

Barrier

MPI_BARRIER is used to synchronize the execution of a group of processes.

[Figure: processes that reach the barrier first must wait ("We can't go on!", "Wait for us!"); once all processes arrive ("We're together!"), the barrier disappears and everyone continues ("Let's go!").]
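In C the barrier is a single call (sketch; assumes MPI_Init has already been called):

    /* every process blocks here until all processes in
       MPI_COMM_WORLD have reached the barrier */
    MPI_Barrier(MPI_COMM_WORLD);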

Page 29:

Data movement (1)

MPI_BCAST: one single process sends the same data to all other processes, itself included.

[Figure: process 0 holds "FACE"; after the BCAST, processes 0–3 all hold "FACE".]
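A C sketch of the figure (assumptions: myid holds this process's rank as on the earlier slides, <string.h> is included, and rank 0 is the root):

    char msg[5];
    if (myid == 0)
        strcpy(msg, "FACE");   /* only the root has the data beforehand */
    MPI_Bcast(msg, 5, MPI_CHAR, 0, MPI_COMM_WORLD);
    /* afterwards every rank's msg holds "FACE" */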

Page 30:

Data movement (2)

MPI_GATHER: every process (the root process included) sends its data to one process, which stores the pieces in rank order.

[Figure: processes 0–3 hold "F", "A", "C", "E"; after the GATHER, process 0 holds "FACE".]
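A C sketch of the figure (assumptions: exactly four processes, and myid holds the rank):

    char mine = "FACE"[myid];  /* rank i contributes the i-th letter */
    char all[4];               /* significant only on the root */
    MPI_Gather(&mine, 1, MPI_CHAR, all, 1, MPI_CHAR, 0, MPI_COMM_WORLD);
    /* on rank 0, all[] now holds {'F','A','C','E'} in rank order */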

Page 31:

Data movement (3)

MPI_SCATTER: a process sends out a message, which is split into several equal parts; the i-th portion is sent to the i-th process.

[Figure: process 0 holds "FACE"; after the SCATTER, processes 0–3 hold "F", "A", "C", "E".]
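A C sketch of the figure (same assumptions: four processes, myid holds the rank):

    char word[5] = "FACE";  /* significant only on the root */
    char mine;
    MPI_Scatter(word, 1, MPI_CHAR, &mine, 1, MPI_CHAR, 0, MPI_COMM_WORLD);
    /* rank i now holds the i-th letter: 'F', 'A', 'C' or 'E' */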

Page 32:

Data movement (4)

MPI_REDUCE (e.g., find the maximum value): combines a value from each process, using a specified operation, and returns the combined value to one process.

[Figure: processes 0–3 hold 8, 9, 3, 7; after a REDUCE with max, process 0 holds 9.]
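A C sketch of the figure (assumptions: four processes, myid holds the rank):

    int vals[4] = {8, 9, 3, 7};  /* the values from the figure */
    int mine = vals[myid];       /* this rank's contribution */
    int best;                    /* significant only on rank 0 */
    MPI_Reduce(&mine, &best, 1, MPI_INT, MPI_MAX, 0, MPI_COMM_WORLD);
    /* rank 0 now has best == 9 */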

Page 33:

Example program (1)

Calculating the value of π by:

$$\pi = \int_{0}^{1} \frac{4}{1+x^{2}}\,dx$$

Page 34:

Example program (2)

……
MPI_BCAST(numprocs, …, …, 0, …)
for (i = myid + 1; i <= n; i += numprocs)
    compute the area for each interval
    accumulate the result in the process's program data (sum)
MPI_REDUCE(&sum, …, …, …, MPI_SUM, 0, …)
if (myid == 0)
    output the result
……

Broadcast the no. of processes:
MPI_BCAST(numprocs, …, …, 0, …)

Each process calculates its assigned areas:
for (i = myid + 1; i <= n; i += numprocs)
    compute the area for each interval
    accumulate the result in the process's program data (sum)

Sum up all the areas:
MPI_REDUCE(&sum, …, …, …, MPI_SUM, 0, …)

Print the result:
if (myid == 0)
    output the result
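A complete C version in the spirit of this slide (a sketch modeled on the classic MPICH cpi example; one assumption to note: here the number of intervals n is fixed and broadcast from rank 0, whereas the slide shows numprocs being broadcast):

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int numprocs, myid, i, n = 10000;  /* n = number of intervals */
        double h, x, sum = 0.0, pi;

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);

        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
        h = 1.0 / (double) n;                    /* width of each interval */
        for (i = myid + 1; i <= n; i += numprocs) {
            x = h * ((double) i - 0.5);          /* midpoint of interval i */
            sum += 4.0 / (1.0 + x * x);          /* 4/(1+x^2) at the midpoint */
        }
        sum *= h;                                /* this process's partial area */

        /* sum the partial areas on rank 0 */
        MPI_Reduce(&sum, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (myid == 0)
            printf("pi is approximately %.16f\n", pi);
        MPI_Finalize();
        return 0;
    }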

Page 35:

[Figure: the area under the curve is divided among processes 0–3 ("Calculated by process 0" … "Calculated by process 3"); when every process reports "OK!", the partial results are combined to give π = 3.141…]

Page 36:

MPICH - A Portable Implementation of MPI

Argonne National Laboratory

Page 37:

What is MPICH???

The first complete and portable implementation of the full MPI standard.

‘CH’ stands for “Chameleon”, a symbol of adaptability and portability.

It contains a programming environment for working with MPI programs.

It includes a portable startup mechanism and libraries.

Page 38:

How can I install it???

Unpack the package mpich.tar.gz into a directory.
Use ‘./configure’ and ‘make >& make.log’ to choose the appropriate architecture and device and to compile the files.
– Syntax: ./configure -device=DEVICE -arch=ARCH_TYPE
• ARCH_TYPE: specifies the type of machine to be configured
• DEVICE: specifies what kind of communication device the system will use, e.g. ch_p4 (TCP/IP)

Page 39:

How to run an MPI Program

1. Edit mpich/util/machines/machines.XXXX to contain the names of the machines of architecture XXXX: for example, the machines mercury, venus, mars, and earth. The file should be in the format:

mercury
venus
earth
mars

Page 40:

How to run an MPI Program

2. Include “mpi.h” in the source program.

3. Compile the program by using the command ‘mpicc’, e.g. mpicc -c foo.c

4. Use ‘mpirun’ to run an MPI program. mpirun will determine the environment for the program to run.

Page 41:

How to run an MPI Program

5. mpirun -np 4 a.out
– runs a.out with four processes (for massively parallel processors)

6. mpirun -arch sun4 -np 2 -arch rs6000 -np 3 program
– runs the program on 2 sun4s and 3 rs6000s, with the local machine being a sun4 (multiple architectures)

Page 42:

MPIRUN (1)

How to start an MPI program? Use mpirun. Example:
– #mpirun -np 4 cpi
– starts four processes of cpi

Page 43:

MPIRUN (2)

What does MPIRUN do?
– 1. Reads the arguments to specify the environment of the MPI program:
  i) how many processes should be started
  ii) on which machines the MPI program will be started
  iii) what device will be used (e.g. ch_p4)
– 2. Assigns the processes to the machines on which they will run
– 3. Records the assignment in the PI???? file

Page 44:

MPIRUN (3)

Example, supposing the ch_p4 device is used:
– #mpirun -np 4 cpi

1. mpirun knows 4 processes need to be started.

2. mpirun reads the machines file to find which machines can be used.

3. The ch_p4 device will be used if no device argument is given in the command.

Page 45:

MPIRUN (4)

4. Split the tasks and save the result in the PI???? file.

File format: <hostname> <no. of proc.> <program>

genius.cs.hku.hk 0 cpi
eagle.cs.hku.hk 1 cpi
dragon.cs.hku.hk 1 cpi
virtue.cs.hku.hk 1 cpi

5. Start the processes on the remote machines by using “rsh”.

Page 46:

Architecture of MPICH

Page 47:

Structure of MPICH

[Figure: layered structure of MPICH. The MPI portable API library sits on top; beneath it is the MPICH abstract device interface; beneath that, the MPICH channel interface; at the bottom are many interchangeable low-level layers, such as socket TCP/IP, shared memory, and vendor designs.]

Page 48:

MPICH - Abstract Device Interface

Interface between the high-level MPI library and the low-level device.

Manages message packaging and buffering policies, and handles heterogeneous communication.

4 sets of functions:
– 1. Specify the send or receive of a message.
– 2. Data movement between the API and the hardware.
– 3. Manage lists of pending messages.
– 4. Provide information about the execution environment.

Page 49:

MPICH - The Channel Interface (1)

The interface transfers data from one process's address space to another's.

Information is divided into two parts:
– message envelope and data

It includes five functions:
• MPID_SendControl, MPID_RecvAnyControl, MPID_ControlMsgAvail - envelope information
• MPID_SendChannel, MPID_RecvFromChannel - data information

Page 50:

MPICH - The Channel Interface (2)

The channel interface adopts a data exchange mechanism according to the size of the message.

Data exchange mechanisms implemented:
– Short, Eager, Rendezvous, Get

Page 51:

Protocol - Short

The data managed by this mechanism is the smallest in size.

The data is delivered within the message envelope.

Page 52:

Short Protocol Data Transfer

[Figure: each control message carries its data within the envelope. Arriving control messages are stored in a buffer until the matching MPI_Recv is posted, at which point the data reaches the receiver.]

Page 53:

Protocol - Eager

Data is sent to the destination immediately.

The receiver must allocate some space to store the data locally.

It is the default choice in MPICH.

It is not suitable for large amounts of data transfer.

Page 54:

Eager Protocol Data Transfer

[Figure: each MPI_Control message is followed immediately by its data. The receiver saves arriving data in a local buffer until the matching MPI_Recv is posted; with many messages in flight, the receiver's buffer can fill up ("Buffer Full!!!").]

Page 55:

Protocol - Rendezvous

Data is sent to the destination only when requested.

If users want to use it, add -use_rndv to the ‘./configure’ command.

No buffering is required.

Page 56:

Rendezvous Protocol Data Transfer

[Figure: the sender transmits only an MPI_Control message and waits. The receiver posts MPI_Recv, matches the control message, and replies with an MPI_Request; only then does the sender transfer the data, which the receiver stores directly ("Received!").]

Page 57:

Protocol - Get

In this protocol, data is read directly by the receiver.

Data is transferred directly from one process's memory to another's.

Highest performance, but it requires either:
– shared memory, or
– remote memory operations

Page 58:

Get Protocol Data Transfer

[Figure: the receiver ("I want to get data from the sender") directly accesses the sender's shared memory and copies the data from the sender's shared memory into its own memory.]

Page 59:

Conclusion

Page 60:

MPI–1.1 (June 95)

MPI 1.1 doesn't provide:
– process management
– remote memory transfers
– active messages
– threads
– virtual shared memory

Page 61:

MPI–2 (July 97)

Extensions to MPI:
– process creation and management
– one-sided communications
– extended collective operations
– external interfaces
– I/O
– additional language bindings