Message Passing Interface
description
Transcript of Message Passing Interface
![Page 2: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/2.jpg)
3D MHD Simulation Using MPI
• 2563 cells• Cray T3E-900• 64 PEs, 8000 hrs• Kim+ 1998, ApJL
![Page 3: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/3.jpg)
Contents• Parallel Computational Model• OpenMP, CUDA, MPI• Speedup, Amdahl’s Law• MPI-1
– Basic MPI routines– Examples of Simple MPI Program
• Hello world• • Matrix-Vector Multiplication• Laplace equation
• Arithmetic Intensity
![Page 4: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/4.jpg)
Parallel Computational Models
M
P
Memory
P P P
M
P
M
P
M
P
PC SMPSymmetric Multiprocessors
MPPMassively Parallel Processors
Multi-P PCIBM power series
Cray T3E
P
![Page 5: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/5.jpg)
Parallel Computational Models
(Beowulf) cluster
M
P P
M
P P
Network Switch
M
P P
Gigabit, Myrinet, Infiniband
![Page 6: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/6.jpg)
Serial Programming
• Languages– Fortran, C, C++
• Optimization of Serial Codes– Compilers, – opts flags, – math kernel, ….
![Page 7: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/7.jpg)
Speedup & Amdahl’s Law
• Speedup
• Linear (ideal) Speedup• Amdahl’s Law
– f: fraction of serial operations– p: number of processors
pp TTS 1
pS p
pff
S p
11
max,
![Page 8: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/8.jpg)
Parallel Programming
• OpenMP– SMP, incremental parallelization
• CUDA (openCL)– Coprocessor, incremental parallelization
• MPI– Cluster, needs sometimes many changes of serial
codes
![Page 9: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/9.jpg)
Hybrid Programming
• OpenMP (MPI) + CUDA– SMP, multiple GPUs
• MPI + OpenMP– Cluster of SMP nodes
• MPI + CUDA– Cluster, multiple GPUs
• MPI + OpenMP + CUDA
![Page 10: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/10.jpg)
MPI ReferencesGropp, W., Lusk, E., & Skjellum, A. 1999, Using MPI, 2ed
(Cambridge: MIT)Gropp, W., Lusk, E., & Thakur, R. 1999, Using MPI-2 (Cambridge:
MIT)
![Page 11: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/11.jpg)
MPI Forums• MPI Forum (‘92-’94)
– MPI-(1.0)• MPI-2 Forum (’95-‘97)
– MPI-1.1 (minor modification); – MPI-1.2– MPI-2; parallel I/O, RMO, DPM
• Later MPI Forum– MPI-1.3: final end of MPI-1 series– MPI-2.1: approved on Sep. 4, 2008– MPI-2.2: approved on Sep. 4, 2009
![Page 12: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/12.jpg)
MPI; Message Passing Interface• What is MPI?
– Communication library for Fortran, C, and C++• Nine basic routines
– MPI_INIT– MPI_FINALIZE– MPI_COMM_SIZE– MPI_COMM_RANK– MPI_SEND– MPI_RECV– MPI_BCAST– MPI_REDUCE– MPI_BARRIER
• One timing routine– MPI_WTIME
![Page 13: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/13.jpg)
• PuTTYhttp://www.chiark.greenend.org.uk/~sgtatham/putty/download.html
Termial emulator for Windows
![Page 14: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/14.jpg)
• Wireless Network
• From outside of the KIASssh [email protected] pw:cac!@)!)
• From inside of the KIAS or netwonssh [email protected] pw:jskim!kias
• http://cac.kias.re.kr/School/2011winter ssh gpu[01-24]
• mkdir (your usual userid)• cd userid
Access to life and GPU clusters
![Page 15: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/15.jpg)
Hello World (sequential)
program main print*, 'hello world' end
![Page 16: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/16.jpg)
Hello World (parallel) program main include 'mpif.h' integer ierr call MPI_INIT(ierr) print*, 'hello world' call MPI_FINALIZE(ierr) end-----------------------------mpif77 hello_world_parallel_1.fmpiexec –n 2 ./a.out
![Page 17: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/17.jpg)
• Sequential programinput: nxoutput: sum
1
0
1
0)arctan(4
14 xdxx
![Page 18: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/18.jpg)
(cont.)
![Page 19: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/19.jpg)
serial)
program main double precision f, a, dx, x, sum, pi integer nx, ix
f(a) = 4.d0 / (1.d0 + a*a) print*, 'number of intervals:‘ read(*,*) nx dx = 1.0d0 / dfloat(nx)
![Page 20: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/20.jpg)
serial) sum = 0.d0 do 10 ix = 1, nx x = dx*(dfloat(ix)-0.5d0) sum = sum + f(x)10 continue pi = dx*sum print*, pi stop end
![Page 21: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/21.jpg)
program main
include 'mpif.h' double precision f, a, dx, x, sum, pip, pi integer nx, nxp, ix integer ierr, np, irank
f(a) = 4.d0 / (1.d0 + a*a)
call MPI_INIT(ierr) call MPI_COMM_SIZE(MPI_COMM_WORLD,np,ierr) call MPI_COMM_RANK(MPI_COMM_WORLD,irank,ierr) if (irank .eq. 0) then print*, 'number of intervals:' read(*,*) nx endif
call MPI_BCAST(nx,1,MPI_INTEGER,0,MPI_COMM_WORLD,ierr) dx = 1.0d0 / dfloat(nx) nxp = nx / np
![Page 22: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/22.jpg)
sum = 0.0d0 do 10 ix = 1, nxp x = dx*(dfloat(irank*nxp+ix)-0.5d0) sum = sum + f(x) 10 continue pip = dx*sum
call MPI_REDUCE(pip,pi,1,MPI_DOUBLE_PRECISION,MPI_SUM,0, $ MPI_COMM_WORLD,ierr)
if (irank .eq. 0) then print*, pi endif
call MPI_FINALIZE(ierr) stop end
![Page 23: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/23.jpg)
Timing MPI programs
double precision starttime, endtimestarttime = MPI_WTIME()---endtime = MPI_WTIME()print*, endtime-starttime, ‘ seconds’
![Page 24: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/24.jpg)
Matrix-Vector Multiplication
c = A bMaster-slave (self-scheduling) algorithm
![Page 25: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/25.jpg)
program main integer MAX_ROWS, MAX_COLS, rows, cols parameter (MAX_ROWS = 1000, MAX_COLS = 1000) double precision a(MAX_ROWS,MAX_COLS), b(MAX_COLS), c(MAX_ROWS)
integer i, j
rows = 100 cols = 100
do 20 j = 1,cols b(j) = 1.0 do 10 i = 1,rows a(i,j) = i 10 continue 20 continue
do 30 i=1,rows c(i) = 0.0 do 40 j=1,cols c(i) = c(i) + a(i,j)*b(j) 40 continue print*, i, c(i) 30 continue
stop end
![Page 26: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/26.jpg)
program main include 'mpif.h' integer MAX_ROWS, MAX_COLS, rows, cols parameter (MAX_ROWS = 1000, MAX_COLS = 1000) double precision a(MAX_ROWS,MAX_COLS), b(MAX_COLS), c(MAX_ROWS) double precision buffer(MAX_COLS), ans
integer myid, master, numprocs, ierr, status(MPI_STATUS_SIZE) integer i, j, numsent, sender integer anstype, row
call MPI_INIT( ierr ) call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr ) call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr ) master = 0 rows = 100 cols = 100
if ( myid .eq. master ) thenc master initializes and then dispatchesc initialize a and b (arbitrary) do 20 j = 1,cols b(j) = 1 do 10 i = 1,rows a(i,j) = i 10 continue 20 continue
![Page 27: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/27.jpg)
numsent = 0c send b to each slave process call MPI_BCAST(b, cols, MPI_DOUBLE_PRECISION,
master, & MPI_COMM_WORLD, ierr)c send a row to each slave process; tag with row number do 40 i = 1,min(numprocs-1,rows) do 30 j = 1,cols buffer(j) = a(i,j) 30 continue call MPI_SEND(buffer, cols,
MPI_DOUBLE_PRECISION, i, & i, MPI_COMM_WORLD, ierr) numsent = numsent+1 40 continue
![Page 28: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/28.jpg)
do 70 i = 1,rows call MPI_RECV(ans, 1, MPI_DOUBLE_PRECISION, & MPI_ANY_SOURCE, MPI_ANY_TAG, & MPI_COMM_WORLD, status, ierr) sender =
status(MPI_SOURCE) anstype = status(MPI_TAG) ! row is tag value c(anstype) = ans if (numsent .lt. rows) then ! send another row do 50 j = 1,cols buffer(j) = a(numsent+1,j) 50 continue call MPI_SEND(buffer, cols, MPI_DOUBLE_PRECISION, & sender, numsent+1, MPI_COMM_WORLD, ierr) numsent = numsent+1 else ! Tell sender that there is no more work call MPI_SEND(MPI_BOTTOM, 0, MPI_DOUBLE_PRECISION, & sender, 0, MPI_COMM_WORLD, ierr) endif 70 continue else
![Page 29: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/29.jpg)
c slaves receive b, then compute dot products until c done message received call MPI_BCAST(b, cols, MPI_DOUBLE_PRECISION, master, & MPI_COMM_WORLD, ierr)c skip if more processes than work if (rank .gt. rows) & goto 200 90 call MPI_RECV(buffer, cols, MPI_DOUBLE_PRECISION, master, & MPI_ANY_TAG, MPI_COMM_WORLD, status, ierr) if (status(MPI_TAG) .eq. 0) then go to 200 else row = status(MPI_TAG) ans = 0.0 do 100 i = 1,cols ans = ans+buffer(i)*b(i) 100 continue call MPI_SEND(ans, 1, MPI_DOUBLE_PRECISION, master, & row, MPI_COMM_WORLD, ierr) go to 90 endif 200 continue endif
call MPI_FINALIZE(ierr) stop end
![Page 30: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/30.jpg)
AI (Arithmetic Intensity)
• Definition: number of operations per byte• Matrix (N,N) – Vector(N) Multiplication
N(N+N-1)/(4N2+4N) ~ ¼ ops/byte• Matrix (N,N) – Matrix(N,N) Multiplication
N2(N+N-1)/(2x4N2) ~ N/8 ops/byte
![Page 31: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/31.jpg)
Example2: Laplace Equation
02 u
41,1,,1,11
,
nji
nji
nji
njin
ji
uuuuu
||,
,1
,
ji
nji
nji uu
Solution: Gauss Iteration
![Page 32: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/32.jpg)
Laplace Equation (cont.)
![Page 33: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/33.jpg)
program mainc integer nx, ny parameter (nx=16,ny=16) double precision u(0:nx+1,0:ny+1), unew(0:nx+1,0:ny+1) double precision eps, anormc-----------------------------------------------------------------------c tolerance c----------------------------------------------------------------------- eps = 1.d-5c-----------------------------------------------------------------------c initialization of uc----------------------------------------------------------------------- do 10 j=0,ny+1 do 10 i=0,nx+1 u(i,j) = 0.d0 10 continuec-----------------------------------------------------------------------c boundary conditionc----------------------------------------------------------------------- call bound (nx,ny,u)c-----------------------------------------------------------------------c Gauss iterationc----------------------------------------------------------------------- 100 continuec do 30 j=1,ny do 30 i=1,nx unew(i,j) = 0.25d0*(u(i-1,j)+u(i+1,j)+u(i,j-1)+u(i,j+1)) 30 continue
![Page 34: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/34.jpg)
c-----------------------------------------------------------------------c Compute a norm of difference between u and unewc----------------------------------------------------------------------- anorm = 0.d0 do 40 j=1,ny do 40 i=1,nx anorm = anorm + abs(unew(i,j)-u(i,j)) 40 continuec-----------------------------------------------------------------------c Update uc----------------------------------------------------------------------- do 50 j=1,ny do 50 i=1,nx u(i,j) = unew(i,j) 50 continuec if (anorm .gt. eps) go to 100c-----------------------------------------------------------------------c write outputc----------------------------------------------------------------------- open(unit=10,file='u.dat')c do 60 j=1,ny write(10,200) (u(i,j),i=1,nx) 60 continuec 200 format(16f5.2)c-----------------------------------------------------------------------c endc----------------------------------------------------------------------- stop end
![Page 35: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/35.jpg)
subroutine bound (nx,ny,u)c-----------------------------------------------------------------------c Dirichlet boundary conditionsc----------------------------------------------------------------------- integer i, j integer nx, ny double precision u(0:nx+1,0:ny+1)c-----------------------------------------------------------------------c left and right boundary conditionc----------------------------------------------------------------------- do 10 j = 1,ny u( 0,j) = 1.0d0 u(nx+1,j) = 0.0d0 10 continuec-----------------------------------------------------------------------c lower and upper boundary conditionc----------------------------------------------------------------------- do 20 i = 1,nx u(i,0) = 1.0d0 u(i,ny+1) = 0.0d0 20 continuec-----------------------------------------------------------------------c endc----------------------------------------------------------------------- return end
![Page 36: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/36.jpg)
Laplace Equation (cont.)
![Page 37: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/37.jpg)
program mainc implicit none include 'mpif.h'c integer nx, ny, np, nyp parameter (nx=16,ny=16,np=2,nyp=8) integer i, j double precision u(0:nx+1,0:nyp+1), unew(0:nx+1,0:nyp+1) double precision eps, anormp, anormc integer ierr, irank, status(MPI_STATUS_SIZE), ndestc character*8 fnamec-----------------------------------------------------------------------c tolerance c----------------------------------------------------------------------- eps = 1.d-5c-----------------------------------------------------------------------c MPI initializationc----------------------------------------------------------------------- call MPI_INIT(ierr) call MPI_COMM_RANK(MPI_COMM_WORLD,irank,ierr)c-----------------------------------------------------------------------c initialization of uc----------------------------------------------------------------------- do 10 j=0,nyp+1 do 10 i=0,nx+1 u(i,j) = 0.d0 10 continuec-----------------------------------------------------------------------c boundary conditionc----------------------------------------------------------------------- call bound (nx,nyp,u,irank,np)
![Page 38: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/38.jpg)
c-----------------------------------------------------------------------c Gauss iterationc----------------------------------------------------------------------- 100 continuec do 20 j=1,nyp do 20 i=1,nx unew(i,j) = 0.25d0*(u(i-1,j)+u(i+1,j)+u(i,j-1)+u(i,j+1)) 20 continuec-----------------------------------------------------------------------c Compute a norm of difference between u and unewc----------------------------------------------------------------------- anormp = 0.d0 do 30 j=1,nyp do 30 i=1,nx anormp = anormp + abs(unew(i,j)-u(i,j)) 30 continue call MPI_REDUCE(anormp,anorm,1,MPI_DOUBLE_PRECISION,MPI_SUM,0, & MPI_COMM_WORLD,ierr) call MPI_BCAST(anorm,1,MPI_DOUBLE_PRECISION,0, & MPI_COMM_WORLD,ierr)c-----------------------------------------------------------------------c Update uc----------------------------------------------------------------------- do 40 j=1,nyp do 40 i=1,nx u(i,j) = unew(i,j) 40 continue
![Page 39: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/39.jpg)
c-----------------------------------------------------------------------c boundary condition between processorsc-----------------------------------------------------------------------c-----------------------------------------------------------------------c from i-th ranked process to (i+1)-th ranked process c----------------------------------------------------------------------- if (irank .eq. (np-1)) then ndest = 0 else ndest = irank+1 endifc call MPI_SEND (u(1,nyp),nx,MPI_DOUBLE_PRECISION, & ndest,irank,MPI_COMM_WORLD,ierr)c call MPI_BARRIER (MPI_COMM_WORLD,ierr)c call MPI_RECV (u(1,0),nx,MPI_DOUBLE_PRECISION, & MPI_ANY_SOURCE,MPI_ANY_TAG, & MPI_COMM_WORLD,status,ierr)c-----------------------------------------------------------------------c from i-th ranked process to (i-1)-th ranked process c----------------------------------------------------------------------- if (irank .eq. 0) then ndest = np-1 else ndest = irank-1 endifc call MPI_SEND (u(1,1),nx,MPI_DOUBLE_PRECISION, & ndest,irank,MPI_COMM_WORLD,ierr)c call MPI_BARRIER (MPI_COMM_WORLD,ierr)c call MPI_RECV (u(1,nyp+1),nx,MPI_DOUBLE_PRECISION, & MPI_ANY_SOURCE,MPI_ANY_TAG, & MPI_COMM_WORLD,status,ierr)
![Page 40: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/40.jpg)
c-----------------------------------------------------------------------c boundary conditionc----------------------------------------------------------------------- call bound (nx,nyp,u,irank,np)c-----------------------------------------------------------------------c Gauss iterationc----------------------------------------------------------------------- if (anorm .gt. eps) go to 100c-----------------------------------------------------------------------c write outputc----------------------------------------------------------------------- write(fname,900) 'u',irank,'.dat'c 900 format (a1,i3.3,a4)c open(unit=10,file=fname)c do 50 j=1,nyp write(10,200) (u(i,j),i=1,nx) 50 continuec 200 format(16f5.2)c-----------------------------------------------------------------------c MPI finalizationc----------------------------------------------------------------------- call MPI_FINALIZE(ierr)c-----------------------------------------------------------------------c endc----------------------------------------------------------------------- stop end
![Page 41: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/41.jpg)
subroutine bound (nx,nyp,u,irank,np)c-----------------------------------------------------------------------c Dirichlet boundary conditionsc----------------------------------------------------------------------- implicit none integer i, j integer nx, nyp, irank, np double precision u(0:nx+1,0:nyp+1)c-----------------------------------------------------------------------c left and right boundary conditionc----------------------------------------------------------------------- do 10 j = 1,nyp u( 0,j) = 1.0d0 u(nx+1,j) = 0.0d0 10 continuec-----------------------------------------------------------------------c lower boundary conditionc----------------------------------------------------------------------- if (irank .eq. 0) then do 20 i = 1,nx u(i,0) = 1.0d0 20 continue endifc-----------------------------------------------------------------------c upper boundary conditionc----------------------------------------------------------------------- if (irank .eq. np-1) then do 30 i = 1,nx u(i,nyp+1)= 0.0d0 30 continue endifc-----------------------------------------------------------------------c endc----------------------------------------------------------------------- return end
![Page 42: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/42.jpg)
Several SEND/RECV routines
• SEND/RECV• BSEND/BRECV• SENDRECV• SSEND/SRECV• ISEND/IRECV
![Page 43: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/43.jpg)
Contents
• What’s New in MPI-2?- Parallel I/O- Remote Memory Operations- Dynamic Process Management
• Parallel isothermal HD Code• Performance Benchmarks
![Page 44: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/44.jpg)
Parallel I/OSequential I/O from an parallel program
memory
processors
file
![Page 45: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/45.jpg)
Parallel I/O (cont.)Parallel I/O to multiple files
memory
processors
file
![Page 46: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/46.jpg)
! ! example of parallel MPI write into multiple filesPROGRAM main ! Fortran 90 users can (and should) use ! use mpi ! instead of include 'mpif.h' if their MPI implementation provides a ! mpi module. include 'mpif.h'
integer ierr, i, myrank, BUFSIZE, thefile parameter (BUFSIZE=100) integer buf(BUFSIZE) character*12 ofname
call MPI_INIT(ierr) call MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
do i = 1, BUFSIZE buf(i) = myrank * BUFSIZE + i enddo
write(ofname,'(a8,i4.4)') 'testfile',myrank
open(unit=11,file=ofname,form='unformatted') write(11) buf
call MPI_FINALIZE(ierr)
END PROGRAM main
![Page 47: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/47.jpg)
Parallel I/O (cont.)Parallel I/O to a single file
memory
processors
file
![Page 48: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/48.jpg)
! !!example of parallel MPI write into a single filePROGRAM main ! Fortran 90 users can (and should) use ! use mpi ! instead of include 'mpif.h' if their MPI implementation provides a ! mpi module. include 'mpif.h'
integer ierr, i, myrank, BUFSIZE, thefile parameter (BUFSIZE=100) integer buf(BUFSIZE) integer(kind=MPI_OFFSET_KIND) disp
call MPI_INIT(ierr) call MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
do i = 1, BUFSIZE buf(i) = myrank * BUFSIZE + i enddo call MPI_FILE_OPEN(MPI_COMM_WORLD, 'testfile', & MPI_MODE_WRONLY + MPI_MODE_CREATE, & MPI_INFO_NULL, thefile, ierr) ! assume 4-byte integers disp = myrank * BUFSIZE * 4 call MPI_FILE_SET_VIEW(thefile, disp, MPI_INTEGER, & MPI_INTEGER, 'native', & MPI_INFO_NULL, ierr) call MPI_FILE_WRITE(thefile, buf, BUFSIZE, MPI_INTEGER, & MPI_STATUS_IGNORE, ierr) call MPI_FILE_CLOSE(thefile, ierr) call MPI_FINALIZE(ierr)
END PROGRAM main
![Page 49: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/49.jpg)
Remote Memory Access
Address space of Process 0 Address space of Process 1
RMA Window
RMA window
put
get
![Page 50: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/50.jpg)
Dynamic Process Management
• Intercommunicator– spawning; connecting
![Page 51: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/51.jpg)
Isothermal HD equations
• Isothermal HD equations
• Conservative Form
0 vt
02
at
vvv
0
zyxtzyx FFFq
z
y
x
vvv
q
zx
yx
x
x
x
vvvvav
v
22
F
zy
y
yx
y
y
vvavvvv
22F
22 avvvvvv
z
zy
zx
z
z
F
![Page 52: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/52.jpg)
Strang-type Directional Splitting
• Reduce the multi-dimensional problem to one-dimensional one.
0:
xtL xx
Fq 0:
ytL yy
Fq 0:
ztL zz
Fq
nxyzzyxyzxxzyzxyyzx
n qLLLLLLLLLLLLLLLLLL ))()()()()((6 q
nxyz
n qLLL1q
![Page 53: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/53.jpg)
Godunov’s Scheme0
xtFq
1i i 1i
21
i21
i
21
21
1
ii
ni
ni x
t FFqq
21
21
),(11 i
i
x
x
nni dxtx
xqq
nSxt
max
- Van Leer- PPM
![Page 54: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/54.jpg)
Riemann problems
1i i 1i21
i21
ix
t t
- Exact Riemann solver- Roe’s Scheme- HLL Scheme
![Page 55: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/55.jpg)
KASI-ARCSEC PC CLUSTER
•128 CPUs•128 GB Mem.•6TB Disk
![Page 56: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/56.jpg)
Test problem for a benchmark for the PC cluster
![Page 57: Message Passing Interface](https://reader036.fdocuments.us/reader036/viewer/2022062521/5681675d550346895ddc2bf3/html5/thumbnails/57.jpg)
Benchmark test of the MHD code