GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

GPUs, Clouds and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living 17 February 2011 Robert Cheetham NC GIS 2011

description

An overview of Azavea's recent work to increase geoprocessing performance through distributed computing, cloud computing, GPUs and other techniques.

Transcript of GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Page 1: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

GPUs, Clouds and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

17 February 2011

Robert Cheetham

NC GIS 2011

Page 2: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

About Azavea

Founded in 2000

27 people
– software engineers
– spatial analysts
– project managers

Web & Mobile apps

Spatial Analysis

R&D

High Performance Computing

User Experience

Page 3: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

B Corporation

• 10% Research Program
• Pro Bono Program
• Time-to-Give-Back Program
• Employee-focused Culture
• Projects with Social Value

Page 4: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

High Performance Geoprocessing

Page 5: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Classic GIS Use Case ...

Page 6: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Robert’s Rules of Housing

Close to Center City
Walk to Grocery Store
Nearby Restaurants
Library
Near a Park
Biking / walking distance from our work
Biking distance to fencing club

Importance ratings: somewhat important, vital, very important, nice to have, somewhat important, very important, somewhat important

Page 7: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Your Factors might include…

Child Care
Local School Rankings
Farmers’ Market
PhillyCarShare
Public Transit

Page 8: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Sustainability Factors

Tax Incentives
Commercial Corridor
Health
Public Transit
Car Share
Open Space
Farmers’ Markets
Street Network Density
Recycling Participation
Walkability

Page 9: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Not a new idea… Ian McHarg

Page 10: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Not a new idea … Design with Nature

Page 11: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Not a new idea … Map Algebra

Page 12: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Desktop GIS

Page 13: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Generate Output Heat Map

[Figure: four factor layers weighted x 5, x 2, x 3 and x 1, added together (+ + + =) to produce the output heat map]
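The weighted sum on this slide is a Map Algebra "local" operation: each output cell depends only on the input cells at the same position. A minimal sketch in plain Python follows. The layer names and the 2x2 scores are made up for illustration; only the weights 5, 2, 3 and 1 come from the slide.

```python
# Weighted overlay: multiply each factor layer by its weight, then add
# the layers cell by cell. All layer names and values are hypothetical.

def weighted_overlay(layers, weights):
    """Return sum(weight * layer), cell by cell, for equally sized grids."""
    if len(layers) != len(weights):
        raise ValueError("need one weight per layer")
    rows, cols = len(layers[0]), len(layers[0][0])
    for layer in layers:
        if len(layer) != rows or any(len(row) != cols for row in layer):
            raise ValueError("all layers must share the same shape")
    return [
        [sum(w * layer[r][c] for layer, w in zip(layers, weights))
         for c in range(cols)]
        for r in range(rows)
    ]

# Hypothetical 2x2 factor scores (0 = poor, 1 = good).
near_center = [[1, 0], [0, 1]]
grocery = [[1, 1], [0, 0]]
park = [[0, 1], [1, 0]]
library = [[1, 1], [1, 1]]

heat_map = weighted_overlay([near_center, grocery, park, library], [5, 2, 3, 1])
print(heat_map)  # [[8, 6], [4, 6]]
```

Because no cell depends on its neighbors, the grid can be cut into pieces and each piece computed independently.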

Page 14: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

The Web is different from the Desktop

Lots of simultaneous users

Stateless environment

HTML+JS+CSS

Users are less skilled

Users are less patient

Page 15: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

ArcGIS Server

Flex, Silverlight and JS APIs

Publish tasks and models

Caching

Optimized MSD files

Page 16: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

But wait … there’s a problem

10 – 60 second calculation time

Multiple simultaneous users …

… that are impatient

Page 17: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

User Interface version 1

Page 18: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
Page 19: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
Page 20: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
Page 21: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
Page 22: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
Page 23: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
Page 24: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Reports

Page 25: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Reports

Page 26: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
Page 27: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Sustainable Business Network

Page 28: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
Page 29: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
Page 30: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
Page 31: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
Page 32: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
Page 33: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
Page 34: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Walkability: Walkshed.org

Page 35: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

NYC Big Apps Submission

Page 36: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Specific Optimization Goals

New Raster File format

Distributed processing

Binary messaging protocol

Page 37: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Optimization: File Format

Simple - strip out metadata

Limit data type and range

1D arrays are fast to read/write

Assume
– Same extent
– Same cell size
– Same pixel data type
– Same cell alignment
– Same projection

Azavea Raster Grid (ARG)
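To illustrate why a metadata-free 1D array is quick to read and write, here is a rough sketch of a flat raster file. It is not the actual ARG layout, which the slides do not describe. The grid shape, the one-byte cell type and the function names are all assumptions made for the example.

```python
import os
import tempfile
from array import array

# Hypothetical grid shape, agreed outside the file because every raster is
# assumed to share the same extent, cell size and data type.
GRID_COLS, GRID_ROWS = 3, 2

def write_flat_raster(path, cells):
    """Write row-major cells (one unsigned byte each) with no header."""
    if len(cells) != GRID_COLS * GRID_ROWS:
        raise ValueError("cell count does not match the shared grid shape")
    with open(path, "wb") as f:
        array("B", cells).tofile(f)

def read_flat_raster(path):
    """Read the whole raster back as a single 1D array."""
    cells = array("B")
    with open(path, "rb") as f:
        cells.fromfile(f, GRID_COLS * GRID_ROWS)
    return cells

def cell_at(cells, col, row):
    """Look up a cell by column and row in the row-major 1D array."""
    return cells[row * GRID_COLS + col]

path = os.path.join(tempfile.mkdtemp(), "demo.bin")
write_flat_raster(path, [0, 10, 20, 30, 40, 50])
cells = read_flat_raster(path)
print(cell_at(cells, col=1, row=1))  # 40
```

With nothing to parse, the file size is exactly columns times rows times bytes per cell, and a read is one bulk copy into memory.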

Page 38: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Optimization: Distributed Processing

Parallelizable - Local Ops and Focal Ops

Support multiple
– Threads
– Cores
– CPUs
– Machines

Considered
– Hadoop
– Amazon Elastic MapReduce
– Beowulf
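One way to picture the decomposition is to cut the raster into bands of rows and hand each band to a worker. The sketch below does this for a 3x3 focal sum using a thread pool from the Python standard library. It is an illustration only, not Azavea's implementation: the function names are hypothetical, the grid is assumed to be non-empty, and pure-Python threads will not run faster under CPython's global interpreter lock.

```python
from concurrent.futures import ThreadPoolExecutor

def focal_sum_rows(grid, start, stop):
    """3x3 focal sum for rows start..stop-1; cells outside the grid count as 0."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(start, stop):
        out_row = []
        for c in range(cols):
            total = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += grid[rr][cc]
            out_row.append(total)
        out.append(out_row)
    return out

def parallel_focal_sum(grid, workers=2):
    """Split the grid into row bands and compute each band in its own task."""
    rows = len(grid)
    band = -(-rows // workers)  # ceiling division
    bounds = [(s, min(s + band, rows)) for s in range(0, rows, band)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda b: focal_sum_rows(grid, *b), bounds)
        # pool.map yields results in input order, so bands stay in place.
        return [row for part in parts for row in part]

grid = [[1, 1], [1, 1]]
print(parallel_focal_sum(grid))  # [[4, 4], [4, 4]]
```

Here every worker can read the whole grid. On separate machines, each band of a focal operation would also need a copy of the rows just above and below it, whereas a local operation needs no such overlap.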

Page 39: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Distributed Processing

Page 40: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Binary Messaging Protocol

Started with XML

Binary Protocol Buffer is better
– simpler
– 3 to 10 times smaller
– 20 to 100 times faster
– less ambiguous
– a bit easier to use programmatically

Considered
– AMF
– Google Protocol Buffers
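The size difference between a text and a binary encoding is easy to see on a toy message. The sketch below uses Python's `struct` module as a stand-in for a binary protocol. It is not Protocol Buffers or AMF, and the message layout (a raster id plus four weights) is invented for the example.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical request layout: uint32 raster id followed by four uint8 weights.
REQUEST = struct.Struct("<I4B")

def encode_binary(raster_id, weights):
    """Pack the request into a fixed 8-byte binary message."""
    return REQUEST.pack(raster_id, *weights)

def decode_binary(payload):
    """Unpack the 8-byte message back into (raster_id, weights)."""
    raster_id, *weights = REQUEST.unpack(payload)
    return raster_id, weights

def encode_xml(raster_id, weights):
    """Encode the same request as a small XML document."""
    root = ET.Element("request", rasterId=str(raster_id))
    for w in weights:
        ET.SubElement(root, "weight").text = str(w)
    return ET.tostring(root)

binary_payload = encode_binary(42, [5, 2, 3, 1])
xml_payload = encode_xml(42, [5, 2, 3, 1])
print(len(binary_payload), len(xml_payload))
print(decode_binary(binary_payload))  # (42, [5, 2, 3, 1])
```

The binary form carries only the values, while the XML form repeats tag names for every field and must be parsed as text on arrival.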

Page 41: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Success!!

Reduced from 10-60 seconds to <500 milliseconds

Page 42: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Additional [Experimental] Measures

Tiling

Pyramids

EC2 for planned peaks – NYC Big Apps

HTTP file caching - Varnish
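Tiling and pyramids can be sketched in a few lines. Tiling cuts a raster into fixed-size blocks so that only the blocks in view need processing. A pyramid stores progressively coarser copies so that zoomed-out requests touch fewer cells. The function names and the 2x2 averaging rule below are assumptions for illustration; the slides do not say how these measures were implemented.

```python
def split_into_tiles(grid, tile_size):
    """Return {(tile_row, tile_col): sub-grid}; edge tiles may be smaller."""
    tiles = {}
    for r0 in range(0, len(grid), tile_size):
        for c0 in range(0, len(grid[0]), tile_size):
            tiles[(r0 // tile_size, c0 // tile_size)] = [
                row[c0:c0 + tile_size] for row in grid[r0:r0 + tile_size]
            ]
    return tiles

def downsample(grid):
    """Halve the resolution by averaging each 2x2 block (even dimensions assumed)."""
    return [
        [(grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4
         for c in range(0, len(grid[0]), 2)]
        for r in range(0, len(grid), 2)
    ]

def build_pyramid(grid, levels):
    """Return [full resolution, half resolution, ...] with `levels` entries."""
    pyramid = [grid]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

grid = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
tiles = split_into_tiles(grid, 2)
pyramid = build_pyramid(grid, 3)
print(tiles[(1, 1)])  # [[4, 4], [4, 4]]
print(pyramid[1])     # [[1.0, 2.0], [3.0, 4.0]]
print(pyramid[2])     # [[2.5]]
```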

Page 43: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Optimizing one process sub-optimizes others

Complex to configure and maintain
One type of operation
No interpolation
No mixing cell sizes
No mixing extents
No mixing projections
No Map Algebra
No ModelBuilder
etc.

Page 44: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

High Performance Geoprocessing 2.0

More generic

Cache data – memory is cheaper

New programming technology
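Keeping layers in memory rather than re-reading them for every request can be sketched with the standard library's `functools.lru_cache`. The `load_layer` function and its dummy return value are hypothetical stand-ins for an expensive disk read. The slides do not describe the caching mechanism actually used.

```python
from functools import lru_cache

@lru_cache(maxsize=32)  # keep up to 32 layers in memory
def load_layer(name):
    """Stand-in for an expensive disk read; returns an immutable dummy grid."""
    print(f"loading {name} from disk")
    return tuple(tuple(len(name) for _ in range(3)) for _ in range(2))

first = load_layer("parks")   # performs the simulated read
second = load_layer("parks")  # served from memory, no read
print(first is second)        # True
```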

Page 45: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

OMB Watch: Federal Spending Equity

Page 46: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

High Performance Geoprocessing 2.0

Reduced calculation time to ~40ms

Page 47: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

GPU geoprocessing research

But wait, there’s more…

Page 48: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

GPU geoprocessing research

Page 49: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

GPU geoprocessing research

Page 50: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Challenges

New languages
– CUDA
– OpenCL
– DirectCompute

Re-write every algorithm

Hardware Diversity

Page 51: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Results

We re-wrote a few Map Algebra operations:
– Local
– Neighborhood
– Zonal
– Viewshed
– etc.

15 – 120x speed improvement
– Large grids
– Large neighborhoods
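For readers unfamiliar with the operation types, a zonal operation summarizes one raster within the zones defined by another. The sketch below is plain CPU Python, not GPU code, and is not taken from the re-written operations. The grids and zone labels are invented for the example.

```python
from collections import defaultdict

def zonal_sum(values, zones):
    """Sum the value raster within each zone of an equally sized zone raster."""
    totals = defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            totals[zone] += value
    return dict(totals)

# Hypothetical value and zone rasters of the same shape.
values = [[1, 2, 3],
          [4, 5, 6]]
zones = [["a", "a", "b"],
         ["a", "b", "b"]]

totals = zonal_sum(values, zones)
print(totals)  # {'a': 7, 'b': 14}
```

Local and neighborhood operations compute each output cell from a fixed set of input cells. A zonal operation instead has to combine many cells into one total per zone.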

Page 52: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Sea Level Rise

Page 53: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Crime Analysis, Early Warning and Forecasting

Page 54: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Hunch Helper

Page 55: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Risk Forecasting

Page 56: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Animation

Page 57: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
Page 58: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
Page 59: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living
Page 60: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Food, Culture and Sustainability

Page 61: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Quick Demo

http://commonspace.us

Page 62: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

The Future

Clouds of Processors - Google App Engine

Faster is different

Page 63: GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Summary

Challenges
• Geographic data growth rates are exploding
• Size of data sets is growing
  – Lidar
  – Raster
  – Social Media
• New form factors that are less powerful
• Distributed data sets
• Larger numbers of less technical users

New Options
• Clouds of processors
• Clouds of virtual machines
• GPUs
