Contemporary Languages in Parallel Computing
Raymond Hummel
Current Languages
Standard Languages
• Distributed Memory Architectures: MPI
• Shared Memory Architectures: OpenMP, pthreads
• Graphics Processing Units: CUDA, OpenCL
Use in Academia
• Journal articles referencing parallel languages and libraries:
  MPI – 863
  CUDA – 539
  OpenMP – 391
  OpenCL – 195
  POSIX – 124
MPI
• Stands for: Message Passing Interface
• Pros
  Extremely Scalable
    Remains the dominant model for high-performance computing today
    Can be used to tie together implementations written in other languages
  Portable
    Runs on almost all OS/hardware combinations
    Bindings exist for multiple languages, from Fortran to Python
  Can harness a multitude of hardware setups
    MPI programs can run on both distributed-memory and shared-memory systems
MPI
• Cons
  Complicated Software
    Requires the programmer to wrap their head around all aspects of parallel execution
    A single program must handle the behavior of every process
  Complicated Hardware
    Building and maintaining a cluster isn’t easy
  Complicated Setup
    Jobs have to be launched with mpirun or mpiexec
    Compiling requires the mpicc wrapper to link the MPI libraries
MPI
OpenMP
• Stands for: Open Multi-Processing
• Pros
  Incremental Parallelization
    Parallelize just that pesky triple for-loop
  Portable
    Does require compiler support, but all major compilers already support it
  Simple Software
    Include the library, add a preprocessor directive, compile with a special flag
OpenMP
• Cons
  Limited Use-Case
    Constrained to shared-memory architectures
    63% of survey participants from http://goparallel.sourceforge.net were focused on development for individual desktops and servers
  Scalability Limited by Memory Architecture
    Memory bandwidth is not scaling at the same rate as computation speeds
OpenMP
POSIX Threads
• Stands for: Portable Operating System Interface Threads
• Pros
  Fairly Portable
    Native support in UNIX operating systems
    Versions exist for Windows as well
  Fine-Grained Control
    Can control the mapping of threads to processors
POSIX Threads
• Cons
  All-or-Nothing
    Software written with pthreads can’t be used on systems that don’t support it
    Major rewrite of the main function required
  Complicated Software
    Thread management
  Limited Use-Case
POSIX Threads
CUDA
• Stands for: Compute Unified Device Architecture
• Pros
  Manufacturer Support
    NVIDIA is actively encouraging CUDA development
    Provides lots of shiny tools for developers
  Low-Level Hardware Access
    Because cross-platform portability isn’t a priority, NVIDIA can expose low-level details
CUDA
• Cons
  Limited Use-Case
    GPU computing requires massive data parallelism
  Only Compatible with NVIDIA Hardware
CUDA
OpenCL
• Stands for: Open Computing Language
• Pros
  Portability
    Works on all major operating systems
  Heterogeneous Platform
    Works on CPUs, GPUs, APUs, FPGAs, coprocessors, etc.
  Works with All Major Manufacturers
    AMD, Intel, NVIDIA, Qualcomm, ARM, and more
OpenCL
• Cons
  Complicated Software
    Manual everything, as the sketch below illustrates
  Special Tuning Required
    Because it cannot assume anything about the hardware it will run on, the programmer has to tell it the best way to do things
Non-Standard Languages
• CILK
• OpenACC
• C++ AMP
CILK
• Language first developed by MIT
• Based on C, commercial improvements extend it to C++
• Championed by Intel
• Operates on the theory that the programmer should identify parallelism, then let the run-time divide the work between processing elements
• Has only 5 keywords: cilk, spawn, sync, inlet, abort
• CILK Plus implementation merged into version 4.9 of the GNU C and C++ compilers
OpenACC
• Stands for: Open ACCelerators
• Not currently supported by major compilers
• Aims to function like OpenMP, but for heterogeneous CPU/GPU systems
• NVIDIA’s answer to OpenCL
C++ AMP
• Stands for: C++ Accelerated Massive Parallelism
• Microsoft library implemented on top of DirectX 11, with an open specification
• Visual Studio 2012 and up provide Debugging and Profiling support
• Works on any hardware that has DirectX 11 drivers
Future Languages
Developing Languages
• D
• Rust
• Harlan
D
• Performance of Compiled Languages
• Memory Safety
• Expressiveness of Dynamic Languages
• Includes a Concurrency Aware Type-System
• Nearing Maturity
Rust
• Designed for creation of large client-server programs on the Internet
• Safety
• Memory Layout
• Concurrency
• Major changes still occurring
Harlan
• Experimental Language
• Based on Scheme
• Designed to take care of boilerplate for GPU Programming
• Could be expanded to include automatic scheduling for both CPU and GPU, depending on available resources.
Questions?