Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted...
Transcript of Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted...
![Page 1: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/1.jpg)
Prototyping and
Developing GPU
Accelerated Solutions
with Python and CUDA
Luciano Martins
Principal Software Engineer
Oracle Corporation
![Page 2: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/2.jpg)
Agenda
Python introduction
GPU programming
Why python with GPU?
Accelerating python
Comparing codes with/without GPU support
Summary
![Page 3: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/3.jpg)
Python introduction
created by Guido van Rossum in 1991
"Zen of Python" which:
Beautiful is better than ugly
Explicit is better than implicit
Simple is better than complex
Complex is better than complicated
Readability counts
interpreted language (CPython, JPython, ...)
dynamically typed; based on objects
![Page 4: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/4.jpg)
Python introduction
small core structure:
~30 keywords
~80 built-in functions
indention is a pretty serious thing
a huge modules ecosystem
binds to many different languages
supports GPU acceleration via modules ←
![Page 5: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/5.jpg)
Python introduction
![Page 6: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/6.jpg)
GPU programming
“the use of a graphics processing unit (GPU)
together with a CPU to accelerate deep
learning, analytics, and engineering
applications” (NVIDIA)
most common GPU accelerated operations:
large vector/matrix operations (BLAS)
speech recognition
computer vision
way more
![Page 7: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/7.jpg)
GPU programming
![Page 8: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/8.jpg)
Why python with GPU?
Interpreted languages has the reputation of
being slow for high performance needs
Python needs assistance for those tasks
Keep the best of both scenarios:
Quick development and prototyping with python
Use high processing power and speed of GPU
Can deliver quick results for complex projects
Gives a business decision choice at the end
![Page 9: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/9.jpg)
Accelerating python
GPU + python projects are arising every day
Accelerated code may be pure python or
adding C code
Focusing here on the following modules
PyCUDA
Numba
cudamat
cupy
scikit-cuda
![Page 10: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/10.jpg)
Accelerating python – PyCUDA
A python wrapper to CUDA API
Requires C programming knowledge (kernel)
Gives speed to python – near zero wrapping
Compiles the CUDA code copy to GPU
CUDA errors translated to python exceptions
Easy installation
![Page 11: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/11.jpg)
Accelerating python – PyCUDA
![Page 12: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/12.jpg)
Accelerating python – PyCUDA
![Page 13: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/13.jpg)
Accelerating python – Numba
high performance functions written in Python
On-the-fly code generation
Native code generation for the CPU and GPU
Integration with the Python scientific stack
Take advantage of Python decorators
No need to write C code
Code translation done using LLVM compiler
![Page 14: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/14.jpg)
Accelerating python – Numba
![Page 15: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/15.jpg)
Accelerating python – cudamat
provides a CUDA-based python matrix class
Primary goal: easy dense matrix manipulation
Useful to perform matrix ops on GPU
Perform many matrix operations
multiplication and transpose
Elementwise addition, subtraction, multiplication,
and division
Elementwise application of exp, log, pow, sqrt
Summation, maximum and minimum along rows
or columns
![Page 16: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/16.jpg)
Accelerating python – cudamat
![Page 17: Prototyping and Developing GPU Accelerated Solutions with ......Why python with GPU? Interpreted languages has the reputation of being slow for high performance needs Python needs](https://reader036.fdocuments.us/reader036/viewer/2022080718/5f78cab8c73fb12cf75fc389/html5/thumbnails/17.jpg)
Accelerating python – cupy
an implementation of NumPy-compatible
multi-dimensional array on CUDA
Useful to perform matrix ops on GPU
CuPy is faster than NumPy in many ways