GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
GPUs, Clouds and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
17 February 2011
Robert Cheetham
NC GIS 2011
About Azavea
Founded in 2000
27 people
– software engineers
– spatial analysts
– project managers
Web & Mobile apps
Spatial Analysis
R&D
High Performance Computing
User Experience
B Corporation
• 10% Research Program
• Pro Bono Program
• Time-to-Give-Back Program
• Employee-focused Culture
• Projects with Social Value
High Performance Geoprocessing
Classic GIS Use Case ...
Close to Center City
Walk to Grocery Store
Nearby Restaurants
Library
Near a Park
Biking / walking distance from our work
Biking distance to fencing club
Importance ratings: somewhat important, vital, very important, nice to have, somewhat important, very important, somewhat important
Robert’s Rules of Housing
Child Care, Local School Rankings, Farmer's Market, PhillyCarShare, Public Transit
Your Factors might include…
Tax Incentives, Commercial Corridor
Health, Public Transit, Car Share, Open Space, Farmers’ Markets, Street Network Density, Recycling Participation, Walkability
Sustainability Factors
Not a new idea… Ian McHarg
Not a new idea … Design with Nature
Not a new idea … Map Algebra
Desktop GIS
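[Figure: weighted overlay. Four factor layers are multiplied by weights x 5, x 2, x 3 and x 1, then added together (+) to produce (=) the output.]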
Generate Output Heat Map
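The weighted overlay on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Azavea's implementation: the function name `weighted_overlay` and the four small factor rasters are hypothetical, and only the weights 5, 2, 3 and 1 come from the slide.

```python
def weighted_overlay(layers, weights):
    """Local map algebra: per-cell weighted sum of same-shaped rasters.

    Each layer is a flat (1D, row-major) list of cell values; all layers
    are assumed to share extent, cell size and alignment.
    """
    if len(layers) != len(weights):
        raise ValueError("need one weight per layer")
    n_cells = len(layers[0])
    if any(len(layer) != n_cells for layer in layers):
        raise ValueError("all layers must have the same number of cells")
    return [
        sum(w * layer[i] for layer, w in zip(layers, weights))
        for i in range(n_cells)
    ]


# Hypothetical 2x2 factor rasters scored 0-10, flattened row by row.
parks = [10, 0, 5, 5]
grocery = [0, 10, 5, 0]
transit = [2, 2, 2, 2]
library = [1, 0, 0, 1]

# Weights 5, 2, 3, 1 as on the slide; higher output means a better match.
heat = weighted_overlay([parks, grocery, transit, library], [5, 2, 3, 1])
```

Because every output cell depends only on the cells at the same position in the inputs, the loop can be split across workers without any coordination.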
The Web is different from the Desktop
Lots of simultaneous users
Stateless environment
HTML+JS+CSS
Users are less skilled
Users are less patient
ArcGIS Server
Flex, Silverlight and JS APIs
Publish tasks and models
Caching
Optimized MSD files
But wait … there’s a problem
10 – 60 second calculation time
Multiple simultaneous users …
… that are impatient
User Interface version 1
Reports
Sustainable Business Network
Walkability: Walkshed.org
NYC Big Apps Submission
Specific Optimization Goals
New Raster File format
Distributed processing
Binary messaging protocol
Optimization: File Format
Simple - strip out metadata
Limit data type and range
1D arrays are fast to read/write
Assume
– Same extent
– Same cell size
– Same pixel data type
– Same cell alignment
– Same projection
Azavea Raster Grid (ARG)
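The idea behind a stripped-down raster file, a fixed data type stored as one flat array, can be sketched as follows. This is not the actual ARG layout, which the slides do not specify: the 8-byte header, the `write_grid` and `read_grid` names and the `.grid` extension are all assumptions made for illustration.

```python
import os
import struct
import tempfile
from array import array

# Hypothetical header: cols and rows as little-endian unsigned 32-bit ints.
HEADER = struct.Struct("<II")


def write_grid(path, cols, rows, cells):
    """Write a raster as a tiny header plus one flat array of C ints."""
    data = array("i", cells)
    if len(data) != cols * rows:
        raise ValueError("cell count does not match cols * rows")
    with open(path, "wb") as f:
        f.write(HEADER.pack(cols, rows))
        data.tofile(f)  # native byte order, one bulk write


def read_grid(path):
    """Read the grid back with a single bulk read, no per-cell parsing."""
    with open(path, "rb") as f:
        cols, rows = HEADER.unpack(f.read(HEADER.size))
        data = array("i")
        data.fromfile(f, cols * rows)
    return cols, rows, data


# Round-trip a hypothetical 3x2 raster through a temporary file.
path = os.path.join(tempfile.mkdtemp(), "demo.grid")
write_grid(path, 3, 2, [1, 2, 3, 4, 5, 6])
cols, rows, cells = read_grid(path)
```

Extent, cell size and projection are deliberately absent from the file: under the assumptions listed above they are the same for every layer, so they can live in one shared place instead of being parsed on every read.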
Optimization: Distributed Processing
Parallelizable - Local Ops and Focal Ops
Support multiple
– Threads
– Cores
– CPUs
– Machines
Considered
– Hadoop
– Amazon Map Reduce
– Beowulf
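Why local operations parallelize so easily can be shown with a small sketch that splits a flat raster into chunks and hands each chunk to a worker. The names `reclassify`, `split` and `parallel_local_op` are hypothetical, and a thread pool stands in for whatever mix of threads, cores and machines the real system used.

```python
from concurrent.futures import ThreadPoolExecutor


def reclassify(chunk):
    """A local operation: each output cell depends only on its input cell."""
    return [1 if value >= 50 else 0 for value in chunk]


def split(cells, n_chunks):
    """Cut a flat raster into n_chunks contiguous, near-equal pieces."""
    size, extra = divmod(len(cells), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(cells[start:end])
        start = end
    return chunks


def parallel_local_op(op, cells, n_workers=4):
    """Apply op to each chunk on a worker pool and stitch results in order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(op, split(cells, n_workers))
        return [value for chunk in results for value in chunk]


cells = list(range(0, 100, 7))  # hypothetical raster values 0, 7, ..., 98
result = parallel_local_op(reclassify, cells)
```

In CPython, threads do not speed up pure-Python arithmetic, so this shows the structure rather than the performance; the same split-and-stitch pattern carries over to processes or remote machines. Focal operations need one extra step: each chunk must also receive a border of neighboring cells as wide as the window radius.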
Distributed Processing
Binary Messaging Protocol
Started with XML
Binary Protocol Buffers are better
– simpler
– 3 to 10 times smaller
– 20 to 100 times faster
– less ambiguous
– a bit easier to use programmatically
Considered
– AMF
– Google Protocol Buffers
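The size difference between a binary message and its XML equivalent can be seen with a small sketch. Python's `struct` module stands in here for Protocol Buffers, which would need a schema file and generated code; the request fields (an operation id, a bounding box and four weights) and the coordinate values are hypothetical.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical request layout: 1-byte op id, four doubles for the bounding
# box (xmin, ymin, xmax, ymax), four 1-byte layer weights. No padding.
REQUEST = struct.Struct("<B4d4B")


def encode_binary(op_id, bbox, weights):
    """Pack the request into a fixed-size binary message."""
    return REQUEST.pack(op_id, *bbox, *weights)


def decode_binary(payload):
    """Unpack a binary message back into (op_id, bbox, weights)."""
    fields = REQUEST.unpack(payload)
    return fields[0], fields[1:5], fields[5:9]


def encode_xml(op_id, bbox, weights):
    """Encode the same request as an XML document for comparison."""
    root = ET.Element("request", op=str(op_id))
    ET.SubElement(root, "bbox").text = ",".join(repr(v) for v in bbox)
    ET.SubElement(root, "weights").text = ",".join(str(w) for w in weights)
    return ET.tostring(root)


bbox = (-75.28, 39.87, -74.96, 40.14)
weights = (5, 2, 3, 1)
binary = encode_binary(1, bbox, weights)
text = encode_xml(1, bbox, weights)
print(len(binary), "bytes binary vs", len(text), "bytes XML")
```

The binary form also removes a parsing step: field positions and types are fixed by the layout, so the receiver does not have to tokenize text or convert strings to numbers.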
Success!!
Reduced from 10-60 seconds to
<500 milliseconds
Additional [Experimental] Measures
Tiling
Pyramids
EC2 for planned peaks – NYC Big Apps
HTTP file caching - Varnish
Optimizing one process sub-optimizes others
Complex to configure and maintain
One type of operation
No interpolation
No mixing cell sizes
No mixing extents
No mixing projections
No Map Algebra
No ModelBuilder
etc.
High Performance Geoprocessing 2.0
More generic
Cache data – memory is cheaper
New programming technology
OMB Watch: Federal Spending Equity
High Performance Geoprocessing 2.0
Reduced calculation time to
~40ms
GPU geoprocessing research
But wait, there’s more…
GPU geoprocessing research
New languages: CUDA, OpenCL, DirectCompute
Re-write every algorithm
Hardware Diversity
Challenges
We re-wrote a few Map Algebra operations: Local, Neighborhood, Zonal, Viewshed, etc.
15x to 120x speed improvement
– Large grids
– Large neighborhoods
Results
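For readers unfamiliar with the operation types named above, here is a plain CPU reference sketch of a neighborhood (focal) operation and a zonal operation. It is not the GPU code from the research; the function names and the 3x3 sample raster are hypothetical.

```python
from collections import defaultdict


def focal_mean(cells, cols, rows, radius=1):
    """Neighborhood (focal) op: mean of the square window around each cell.

    Windows are clipped at the raster edge, so border cells average fewer
    neighbors.
    """
    out = []
    for r in range(rows):
        for c in range(cols):
            window = [
                cells[rr * cols + cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out.append(sum(window) / len(window))
    return out


def zonal_sum(values, zones):
    """Zonal op: total of the value raster within each zone id."""
    totals = defaultdict(int)
    for value, zone in zip(values, zones):
        totals[zone] += value
    return dict(totals)


# Hypothetical 3x3 rasters, flattened row by row.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
zones = [1, 1, 2, 1, 2, 2, 3, 3, 3]

smoothed = focal_mean(values, cols=3, rows=3)
per_zone = zonal_sum(values, zones)
```

In `focal_mean` every output cell is computed independently and the work per cell grows with the square of the window radius. That combination, many independent cells with a lot of arithmetic each, is the kind of workload a GPU handles well, which fits the larger gains reported for large grids and large neighborhoods.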
Sea Level Rise
Crime Analysis, Early Warning and Forecasting
Hunch Helper
Risk Forecasting
Animation
Food, Culture and Sustainability
Quick Demo
http://commonspace.us
The Future
Clouds of Processors - Google App Engine
Faster is different
Summary
Challenges
• Geographic data growth rates are exploding
• Size of data sets is growing
  – Lidar
  – Raster
  – Social Media
• New form factors that are less powerful
• Distributed data sets
• Larger numbers of less technical users
New Options
• Clouds of processors
• Clouds of virtual machines
• GPUs
Many Thanks!
© Photo used with permission from Alphafish, via Flickr.com