QGIS plugin for parallel processing in terrain analysis
© Arthur J. Lembo, Jr.
Salisbury University
QGIS Plug-in for Parallel Processing in Terrain Analysis
Arthur Lembo
Department of Geography and Geoscience
@artlembo
If you were plowing a field, which would
you rather use? Two strong oxen or 1024
chickens?
- Seymour Cray
Overview
• As part of an NSF REU, we built a QGIS plug-in to perform terrain-based parallel processing with Python.
• This presentation shows the results of our undergraduate student research project:
– Multicore processors
– Massively parallel GPGPUs
– Hardware evolution
– Our QGIS plug-in
– The road ahead
NSF REU
• The NSF REU is a 3-year grant (extended to 5 years) focused on parallel processing.
• The goal is to expose undergraduates to academic research in computer science.
• My role has been to mentor students in the use of parallel processing in geography.
Microcomputer revolution
• 1971 Intel 4004
• Ted Hoff
• $60,000
• 2,300 transistors
• 582,000,000 Quad
• Lithography
• Killed time-sharing
Moore’s Law
• 2x / 18 months
• Design shrink
• Smaller = faster
Trouble in paradise?
Limits of Moore’s Law
• Heat
– Limits on power density
– Package dissipation limit
– Water-cooled overclocking
• Subunit complexity
– Single clock cycle synchronicity
– AMD translation lookaside buffer bug
• RC interconnect delay
Repeating history
Parallel microprocessors
• Parallel dies
• Parallel packages
• Core 2 Duo
• Core 2 Quad
Limited uptake
• 64-bit just getting traction
• Windows parallelism
• Multithreading difficult to code
• Parallel code even harder
• Scientific computing
• Who cares what I have to say?
• Gaming leads the way
Opening week:
Grand Theft Auto: $500M
Halo 3: $300M
Spiderman 3: $182M
Pirates 3: $196M
Options for Large Geographic Computations
• Use a smaller dataset
• Generalize the resolution of your dataset
– Both options compromise the integrity of the data
• Invest in clusters (groups of ordinary PCs joined together for combined power and parallel processing) or time sharing
– Requires special programming
– Very costly
Background
Parallel processing - a program allows multiple computations to occur concurrently
GPU - graphics processing unit, generally used for video and gaming visuals processing
Designed for multithreading, contains hundreds of cores, good at simple math
CPU - central processing unit, where computation is traditionally done
Contains a much smaller number of cores, good at complex calculations
Test Environment
GPU - Nvidia GTX 670
– 1344 CUDA cores
– 2 GB GDDR5 RAM
CPU - Intel Xeon E5607 processor
– 4 cores, 4 threads
– 2.27 GHz
Why PyCUDA
• Expose CUDA functions in QGIS
• Easier to program?
• Easier to add functionality?
Terrain functions
• Started with 3 common GIS functions
• Slope: ~15 calculations; aspect: ~20 calculations; hillshade: ~45 calculations
• All are embarrassingly parallel: each output cell can be computed independently of the others
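To make the per-cell work concrete, here is a minimal sketch of slope and aspect for a single 3x3 elevation window. It assumes Horn's finite-difference formula, with row 0 as the northern row and square cells; the slides do not say which formula the plug-in uses, and the name `slope_aspect` is hypothetical. Because the result depends only on the nine values passed in, every cell of the raster can be computed independently.

```python
import math

def slope_aspect(window, cell_size=1.0):
    """Return (slope, aspect) in degrees for the centre of a 3x3 window.

    Assumption: Horn's method, row 0 is north, column 0 is west.
    Aspect is the downslope compass bearing, or None on flat ground.
    """
    (a, b, c), (d, _centre, f), (g, h, i) = window
    # Elevation change per unit distance toward the east and toward the south.
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * cell_size)
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8 * cell_size)

    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0 and dz_dy == 0:
        return slope, None  # flat cell: aspect is undefined

    # Downslope vector is (east = -dz_dx, north = dz_dy); bearing is
    # measured clockwise from north.
    aspect = math.degrees(math.atan2(-dz_dx, dz_dy)) % 360
    return slope, aspect

# A surface rising 1 unit per cell toward the east faces west.
print(slope_aspect([[0, 1, 2], [0, 1, 2], [0, 1, 2]]))
```

Hillshade builds on the same two gradients plus a sun position, which is roughly why it needs more arithmetic per cell than slope or aspect.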
Terrain visuals
Altitude Hillshade Slope
Methods, cont.
Scheduler
• Overall manager
• Starts and manages the processes which:
– load data,
– perform raster calculations on the GPU,
– save data back to disk.
GPU Calculator
• Where the actual calculations are performed
• Reads data from the input pipe, performs GPU calculations over all the cores, and sends the result through the output pipe to the data saver
• Designed so any algorithm based on a 3x3 grid of pixels can be used
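The "any 3x3 algorithm" design can be sketched as a driver that slides a window over the grid and calls whatever function it is handed. This is a plain-Python stand-in for the GPU kernel, not the plug-in's code; `apply_3x3` and `window_mean` are hypothetical names, and leaving border cells as `nodata` is an assumption about edge handling.

```python
def apply_3x3(grid, func, nodata=None):
    """Apply func to every interior cell's 3x3 neighbourhood.

    grid is a list of equal-length rows; border cells, which lack a full
    neighbourhood, are set to nodata (an assumed edge policy).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    out = [[nodata] * n_cols for _ in range(n_rows)]
    for r in range(1, n_rows - 1):
        for c in range(1, n_cols - 1):
            window = [grid[r + dr][c - 1:c + 2] for dr in (-1, 0, 1)]
            out[r][c] = func(window)  # on a GPU, one thread per cell
    return out

def window_mean(window):
    """Example plug-in function: mean of the nine cells."""
    return sum(map(sum, window)) / 9.0

grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
print(apply_3x3(grid, window_mean))
```

Swapping `window_mean` for a slope, aspect, or hillshade function changes the output without touching the driver, which is the point of the design.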
Saver
• Takes the results of the calculations done in the GPU calculator and saves them to disk
• Uses the GDAL libraries to write multiple lines at a time to a GeoTIFF
• Multiple savers can all run in parallel to save the outputs of different functions
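The loader, calculator, and saver stages can be sketched as a three-stage pipeline. This version swaps in threads and `queue.Queue` from the standard library for the processes and pipes described on the slides, and it appends to a list where the real saver would write to disk with GDAL; all function names are hypothetical.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the stream

def loader(rows, to_calc):
    """Stage 1: feed rows (stand-in for reading the raster from disk)."""
    for row in rows:
        to_calc.put(row)
    to_calc.put(DONE)

def calculator(to_calc, to_save, func):
    """Stage 2: transform each item (stand-in for the GPU calculation)."""
    while (item := to_calc.get()) is not DONE:
        to_save.put(func(item))
    to_save.put(DONE)

def saver(to_save, sink):
    """Stage 3: collect results (stand-in for writing a GeoTIFF)."""
    while (item := to_save.get()) is not DONE:
        sink.append(item)

def run_pipeline(rows, func):
    """The scheduler role: start the stages, wait for them, return output."""
    to_calc, to_save = queue.Queue(maxsize=4), queue.Queue(maxsize=4)
    sink = []
    stages = [
        threading.Thread(target=loader, args=(rows, to_calc)),
        threading.Thread(target=calculator, args=(to_calc, to_save, func)),
        threading.Thread(target=saver, args=(to_save, sink)),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
    return sink

print(run_pipeline([[1, 2], [3, 4]], lambda row: [2 * v for v in row]))
```

The bounded queues let loading, computing, and saving overlap while keeping memory use in check, so the calculator is not left idle while the next rows are read.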
Results and Discussion (Stage 1 - out of the box)
Python by itself is much slower than C++
PyCUDA is faster both because it uses the GPU and because its kernels are written in C
Takeaway: when given the option, use PyCUDA libraries
Size     Python      C++       PyCUDA    QGIS
25 MB    50 secs     5 secs    4 secs    5 secs
200 MB   7:30 mins   40 secs   28 secs   15 secs
Results and Discussion (Stage 2 - threading)
Adding CPU-based parallelism increases gains
Reduces time waiting for data to be given to GPU
Size     Threaded Python   Threaded C++   Threaded PyCUDA   QGIS
25 MB    45 secs           5 secs         4 secs            5 secs
200 MB   7:30 mins         40 secs        9 secs            15 secs
1.5 GB   1:30 hrs          18:03 mins     9:04 mins         9 mins
Results and Discussion (Stage 2 - added computation)
Adding more complex computations allows us to maximize the GPU contribution.
Computing hillshade, which requires about 3x more computations than slope
PyCUDA doesn’t even slow down when switching formulas
Shows that it can do much more before peaking out
Size     QGIS         Threaded PyCUDA
25 MB    5 secs       4 secs
1.5 GB   11:00 mins   9:04 mins
12 GB    45:00 mins   50:00 mins
Results and Discussion (Stage 3 - further optimization)
The main bottleneck in the computation is disk I/O
Total time the GPU is working for the 1.5 GB file is less than 2 seconds
Increasing the size of reads and writes gains even more time
Phase         QGIS         Threaded PyCUDA
Input         2:00 mins    1:55 mins
Computation   9:00 mins    1:00 mins
Output        2:00 mins    2:20 mins
Total         11:00 mins   3:35 mins
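The Stage 3 timings can be turned into speedup ratios with a few lines of arithmetic. The sketch below uses only the Computation and Total rows as reported; the helper names are hypothetical. The results line up with the "one ninth" and "triple" figures quoted later in the deck.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

def speedup(qgis, pycuda):
    """How many times faster the threaded PyCUDA run was than QGIS."""
    return to_seconds(qgis) / to_seconds(pycuda)

# (QGIS, threaded PyCUDA) timings copied from the Stage 3 table.
timings = {
    "Computation": ("9:00", "1:00"),
    "Total": ("11:00", "3:35"),
}

for phase, (qgis, pycuda) in timings.items():
    print(f"{phase}: {speedup(qgis, pycuda):.2f}x")
# Computation: 9.00x
# Total: 3.07x
```

The gap between 9x on computation and about 3x overall is the I/O bottleneck: input and output take roughly the same time in both tools.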
Results and Discussion (Stage 3 - chunking)
12 GB file
Lines read   Time taken
1            50:00
10           39:50
15           28:30
20           33:30
30           37:00
40           40:00
50           48:30
1.5 GB file
Lines read   Time taken
1            5:18
10           3:54
15           3:00
20           3:50
30           4:16
40           4:52
50           5:21
• Reading too much data in one call causes slowdown
• Optimal number is ~15 lines for all sizes
• No apparent ratio between disk read lines and raster column and row sizes
– 1.5 GB is 14400 rows * 28800 cols
– 12 GB is 51187 rows * 60818 cols
• The limitation is how fast we can send data to GPU
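Reading a fixed number of lines per call can be sketched as a generator of row ranges. This is an illustration under stated assumptions, not the plug-in's reader: `row_chunks` is a hypothetical name, and the one-row overlap (`halo`) is an assumption so that a 3x3 window always has the rows above and below the block it is computing.

```python
def row_chunks(n_rows, lines_per_read=15, halo=1):
    """Yield (read_start, read_stop, out_start, out_stop) row ranges.

    out_* is the block of rows whose results are produced; read_* widens
    it by `halo` rows on each side, clipped to the raster, so 3x3 windows
    at the block edges have their neighbours. Stops are exclusive.
    """
    for out_start in range(0, n_rows, lines_per_read):
        out_stop = min(out_start + lines_per_read, n_rows)
        read_start = max(out_start - halo, 0)
        read_stop = min(out_stop + halo, n_rows)
        yield read_start, read_stop, out_start, out_stop

# With GDAL, each range would typically become one windowed read, e.g.
# band.ReadAsArray(0, read_start, n_cols, read_stop - read_start).
for chunk in row_chunks(40, lines_per_read=15):
    print(chunk)
```

Changing `lines_per_read` is the knob varied in the tables above: small values mean many read calls, large values mean big transfers that delay the first GPU work.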
Results
• The PyCUDA version is consistently faster than QGIS when calculating hillshade for files of various sizes.
• GPU computations, including CPU-based memory management, took one ninth of the time required to do the same thing in QGIS.
• The I/O bottleneck can be seen in the input and output sections of the second table.
• Output takes much longer because the saver has to wait for the GPU to pass it data before it can start saving to disk.
Other Takeaways
• CUDA is very efficient when you have a smaller number of data elements, but massive calculations per element.
• Terrain-based analyses use massive amounts of data, but few calculations per data element.
Earlier work
Next Steps
• Improve the installation process – it is
too arduous at the moment
• Get the plug-in to work in Windows
Conclusion
• Early results show the ability to triple terrain analysis
speed compared to serial methods
• Multithreading can significantly improve GIS
analysis speed
• Try it out for yourself:
https://github.com/aFuerst/PyCUDA-Raster
[Chart comparing GPU, C++, QGIS, and serial versions]
SO, WHAT IS A GOOD GIS
EXAMPLE OF MASSIVE
CALCULATIONS PER DATA
ELEMENT?
Acknowledgements
Salisbury University
National Science Foundation (Award #1460900)
Students:
William Hoffman
Charlie Kazer
Alex Fuerst