UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science
Parallel & Concurrent Programming: OpenMP
Emery Berger
CMPSCI 691W
Spring 2006
Outline
Last time(s): MPI - point-to-point & collective
  Library calls
Today: OpenMP - parallel directives
  Language extensions to Fortran/C/C++
Motivation
Take vectors a & b (100 ints)
Distribute across all processors
Each processor:
  Compute sum of all a[i] * b[i]
Print overall sum
MPI: Use MPI_Scatter, MPI_Gather or MPI_Reduce
  MPI_Scatter/Gather(sendbuf, cnt, type, recvbuf, recvcnt, type, root, comm)
  MPI_Reduce(sendbuf, recvbuf, cnt, type, op, root, comm)
MPI Solution
MPI_Init (&argc, &argv);
MPI_Comm_rank (MPI_COMM_WORLD, &rank);
MPI_Comm_size (MPI_COMM_WORLD, &size);
// Distribute a and b: each rank receives 100 / size elements
// (assumes size divides 100 evenly)
MPI_Scatter (a, 100 / size, MPI_INT, a1, 100 / size, MPI_INT, 0, MPI_COMM_WORLD);
MPI_Scatter (b, 100 / size, MPI_INT, b1, 100 / size, MPI_INT, 0, MPI_COMM_WORLD);
// Multiply each local chunk
for (int i = 0; i < 100 / size; i++) {
  z += a1[i] * b1[i];
}
// Reduce by summing the partial results into sum on rank 0
// (send and receive buffers must be distinct)
MPI_Reduce (&z, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
// Output result
if (rank == 0) {
  cout << sum << endl;
}
MPI_Finalize ();
Ideal Solution
int z = 0;
parallel for (i = 0; i < 100; i++) {
  z += a[i] * b[i];
}
cout << z << endl;
OpenMP Solution
int z = 0;
#pragma omp for
for (int i = 0; i < 100; i++) {
  z += a[i] * b[i];
}
cout << z << endl;
OpenMP pragma directives
  Omit = sequential program
  More declarative style
  Add more pragmas for more efficiency
OpenMP Concepts
Fork-join model
  One thread executes sequential code
  Upon reaching parallel directive:
    Start new team of work-sharing threads
    Wait until all done (usually barrier)
  Can be nested!
Apparent global shared memory, but relaxed consistency model
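A minimal sketch of the fork-join model, assuming a C++ compiler with OpenMP enabled (for example with a flag such as -fopenmp). The function name count_team_members is hypothetical, and the fallback definitions are there only so the sketch still builds as a sequential program when OpenMP is off.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Fallbacks so the sketch also builds as a plain sequential program.
static int omp_get_thread_num() { return 0; }
static int omp_get_max_threads() { return 1; }
#endif

// Returns how many threads took part in one parallel region.
int count_team_members() {
    // Sequential part: only the initial thread runs here.
    std::vector<int> seen(omp_get_max_threads(), 0);

    // Fork: a team of threads executes the block.
    #pragma omp parallel
    {
        seen[omp_get_thread_num()] = 1;  // each thread writes only its own slot
    }
    // Join: implicit barrier; only the initial thread continues from here.

    int members = 0;
    for (int s : seen) members += s;
    return members;
}
```

The result depends on the machine and the runtime settings, so the only safe expectation is a value between 1 and omp_get_max_threads().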
Consistency
Consistency = ordering of reads & writes
  In same thread, across threads
Most “intuitive” consistency model = sequential consistency (Lamport)
  Behaves like some sequential execution
  BUT: seriously limits parallelism
    Must synchronize frequently
OpenMP Consistency
OpenMP: consistency across flushes
  A flush writes a set of variables to memory
  If two flushes have intersecting sets, the flushes must be seen in some sequential order by all threads
Parallel Execution
#pragma omp parallel
  Executes next chunk of code across all or some number of threads
    num_threads(n)
Only “master thread” continues after parallel section completes
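A small sketch of the num_threads clause, under the assumption that OpenMP is enabled at compile time. The function name team_size is hypothetical. The clause is a request: the runtime may grant fewer threads than asked for.

```cpp
#ifdef _OPENMP
#include <omp.h>
#else
// Fallback so the sketch also builds as a plain sequential program.
static int omp_get_num_threads() { return 1; }
#endif

// Requests a team of `requested` threads (must be positive) and
// returns the team size that was actually granted.
int team_size(int requested) {
    int granted = 0;
    #pragma omp parallel num_threads(requested)
    {
        // Exactly one thread records the team size, avoiding concurrent writes.
        #pragma omp single
        granted = omp_get_num_threads();
    }
    // Only the master thread continues past the end of the parallel region.
    return granted;
}
```

For instance, team_size(4) returns some value from 1 to 4, depending on what the runtime allows.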
Dynamic Threads
Parallel + nowait
Implicit barrier unless nowait
Barrier = flush operation
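As an illustration of nowait, the sketch below runs two independent loops in one parallel region. The function and parameter names are hypothetical. Dropping the barrier is safe here only because the second loop never touches the data written by the first.

```cpp
#include <vector>

// Fills two independent arrays inside one parallel region.
void fill_both(std::vector<int>& squares, std::vector<int>& cubes) {
    const int n = static_cast<int>(squares.size());
    const int m = static_cast<int>(cubes.size());
    #pragma omp parallel
    {
        // nowait: threads that finish their share move on without waiting.
        #pragma omp for nowait
        for (int i = 0; i < n; i++) {
            squares[i] = i * i;
        }
        // This loop keeps its implicit barrier (and the flush it implies).
        #pragma omp for
        for (int i = 0; i < m; i++) {
            cubes[i] = i * i * i;
        }
    }
}
```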
Parallel + Memory
Memory model:
  Heap objects shared
  Stack objects declared inside the parallel region private
    Includes loop iterators
  unless indicated otherwise...
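The default sharing rules can be sketched as follows. The function name doubled is hypothetical; the comments mark which objects are shared and which are private under the defaults described above.

```cpp
#include <vector>

// Each iteration writes one element of a shared, heap-allocated array.
std::vector<int> doubled(const std::vector<int>& in) {
    std::vector<int> out(in.size());     // element storage is on the heap: shared
    const int n = static_cast<int>(in.size());
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {        // loop iterator i: private to each thread
        int tmp = 2 * in[i];             // declared inside the region: private
        out[i] = tmp;                    // distinct elements, so no race
    }
    return out;
}
```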
Parallel Example
Data-Sharing Attributes
shared
private
  Each thread gets own private copy
  Initial value undefined
firstprivate
  Copies in original value
lastprivate
  Copies out private value from the sequentially last iteration
Lastprivate Example
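One possible lastprivate example, combined with firstprivate. This is a sketch with hypothetical names, not necessarily the example shown in the lecture, and it assumes n > 0, since the copied-out value is unspecified when the loop has no iterations.

```cpp
// Returns offset + (n - 1) * (n - 1), the value computed by the
// sequentially last iteration. Assumes n > 0.
int last_shifted_square(int n, int offset) {
    int last = -1;
    // firstprivate(offset): each thread's copy starts with the original value.
    // lastprivate(last): the copy from the sequentially last iteration is
    // written back to the original variable after the loop.
    #pragma omp parallel for firstprivate(offset) lastprivate(last)
    for (int i = 0; i < n; i++) {
        last = offset + i * i;
    }
    return last;
}
```

With n = 10 and offset = 5, the last iteration is i = 9, so the result is 5 + 81 = 86 regardless of how the iterations were scheduled.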
Threadprivate Example
Can also declare variables as always thread-private
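A hedged sketch of a threadprivate variable: the counter calls and the function count_iterations are hypothetical names chosen for illustration. Each thread keeps its own copy of calls, and the per-thread totals are then combined into a shared variable.

```cpp
// Hypothetical per-thread counter: every thread owns a separate copy.
static int calls = 0;
#pragma omp threadprivate(calls)

// Counts loop iterations per thread, then sums the per-thread counters.
int count_iterations(int n) {
    int total = 0;
    #pragma omp parallel
    {
        calls = 0;                   // reset this thread's own copy
        #pragma omp for
        for (int i = 0; i < n; i++) {
            calls++;                 // no race: each thread updates its own copy
        }
        #pragma omp atomic
        total += calls;              // shared variable, so update atomically
    }
    return total;
}
```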
Reduce
reduction
  private value per thread
  initialized “appropriately”
    uses predefined operators
  copies out to original
reduction(+:a)
  initializes a = 0
reduction(*:a)
  initializes a = 1
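To illustrate the multiplicative case, here is a minimal sketch that computes a factorial with reduction(*:result). The function name factorial is an assumption made for the example.

```cpp
// Computes n! with a multiplicative reduction.
long long factorial(int n) {
    long long result = 1;
    // Each thread's private copy of result starts at 1 (the identity for *);
    // the partial products are multiplied into the original at the end.
    #pragma omp parallel for reduction(*:result)
    for (int i = 2; i <= n; i++) {
        result *= i;
    }
    return result;
}
```

factorial(10) yields 3628800 whether the loop runs on one thread or many, because multiplication of integers is associative and commutative.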
OpenMP Solution
int z = 0;
#pragma omp for reduction(+:z)
for (int i = 0; i < 100; i++) {
  z += a[i] * b[i];
}
cout << z << endl;
OpenMP pragma directives
  Omit = sequential program
  More declarative style
  Add more pragmas for more efficiency
All Together
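One way the pieces might fit together, written as a self-contained function: parallel forks the team, for splits the iterations, and reduction(+:z) handles the sum. The names a, b, and z follow the slides; the function name dot and the use of std::vector are assumptions.

```cpp
#include <vector>

// Dot product of two equal-length vectors.
int dot(const std::vector<int>& a, const std::vector<int>& b) {
    int z = 0;
    const int n = static_cast<int>(a.size());
    // parallel: fork a team; for: split the iterations among the team;
    // reduction(+:z): per-thread partial sums, added into z at the join.
    #pragma omp parallel for reduction(+:z)
    for (int i = 0; i < n; i++) {
        z += a[i] * b[i];
    }
    return z;
}
```

Compiled without OpenMP, the pragma is ignored and the same function runs sequentially with the same result.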
But Still Races...
Master & Synchronization
master
  Always run by master thread
critical
  Declares critical section (one thread at a time)
  Can add names for greater concurrency
barrier
atomic
  Updated atomically (a++, a--, etc.)
ordered
  Executes the ordered block in sequential loop order
Atomic Example
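A possible atomic example, sketched as a histogram. The function name histogram and its parameters are hypothetical. Several threads may increment the same bin, so the update is marked atomic.

```cpp
#include <vector>

// Builds a histogram of the values in [0, bins); other values are skipped.
std::vector<int> histogram(const std::vector<int>& data, int bins) {
    std::vector<int> counts(bins, 0);
    int* c = counts.data();          // plain pointer to the shared bin storage
    const int n = static_cast<int>(data.size());
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        const int v = data[i];
        if (v < 0 || v >= bins) continue;
        // Different threads may hit the same bin: make the increment atomic.
        #pragma omp atomic
        c[v]++;
    }
    return counts;
}
```

A critical section would also be correct here, but it would serialize all updates, whereas atomic only protects the single memory location being incremented.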
The End
Single Example
Ordered For
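A sketch of an ordered loop, with a hypothetical function name. The squaring may happen in parallel and in any order, while the ordered block runs one iteration at a time in loop order.

```cpp
#include <vector>

// Appends i * i for i in [0, n) to a vector, in iteration order.
std::vector<int> squares_in_order(int n) {
    std::vector<int> out;
    #pragma omp parallel for ordered
    for (int i = 0; i < n; i++) {
        const int sq = i * i;        // may run in parallel, in any order
        // The ordered block runs one iteration at a time, in loop order,
        // so push_back is race-free and the result is deterministic.
        #pragma omp ordered
        out.push_back(sq);
    }
    return out;
}
```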
Copyin Example
Copyprivate Example
The End
Next time: OpenMP