GPUs, Cloud and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

Post on 12-Jan-2015

An overview of Azavea's recent work to increase geoprocessing performance through distributed computing, cloud computing, GPUs and other techniques.

GPUs, Clouds and Grids: Distributed Geoprocessing for Speed, Scalability and Better Living

17 February 2011

Robert Cheetham

NC GIS 2011

About Azavea

Founded in 2000

27 people
– software engineers
– spatial analysts
– project managers

Web & Mobile apps

Spatial Analysis

R&D

High Performance Computing

User Experience

B Corporation

• 10% Research Program
• Pro Bono Program
• Time-to-Give-Back Program
• Employee-focused Culture
• Projects with Social Value

High Performance Geoprocessing

Classic GIS Use Case ...

Close to Center City

Walk to Grocery Store

Nearby Restaurants

Library

Near a Park

Biking / walking distance from our work

Biking distance to fencing club

Importance ratings: somewhat important, vital, very important, nice to have, somewhat important, very important, somewhat important

Robert’s Rules of Housing

Child Care, Local School Rankings, Farmer's Market, PhillyCarShare, Public Transit

Your Factors might include…

Tax Incentives, Commercial Corridor

Health, Public Transit, Car Share, Open Space, Farmers’ Markets, Street Network Density, Recycling Participation, Walkability

Sustainability Factors

Not a new idea… Ian McHarg

Not a new idea … Design with Nature

Not a new idea … Map Algebra

Desktop GIS

Layer x 5 + Layer x 2 + Layer x 3 + Layer x 1 =

Generate Output Heat Map
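The weighted overlay on this slide (multiply each factor layer by its importance weight, then add the layers cell by cell) can be sketched in a few lines of Python. This is a minimal illustration, not Azavea's implementation: the function name `weighted_overlay`, the layer names and the 2x2 grids are hypothetical, and only the weights 5, 2, 3 and 1 come from the slide.

```python
def weighted_overlay(layers, weights):
    """Local map-algebra operation: per-cell weighted sum of same-shaped grids."""
    rows, cols = len(layers[0]), len(layers[0][0])
    out = [[0] * cols for _ in range(rows)]
    for grid, weight in zip(layers, weights):
        for r in range(rows):
            for c in range(cols):
                out[r][c] += weight * grid[r][c]
    return out


# Hypothetical 2x2 factor grids: 1 means the cell satisfies the factor.
grocery = [[1, 0], [0, 1]]
park = [[1, 1], [0, 0]]
transit = [[0, 1], [0, 1]]
library = [[1, 0], [0, 0]]

# Weights taken from the slide: x 5, x 2, x 3, x 1.
heat = weighted_overlay([grocery, park, transit, library], [5, 2, 3, 1])
print(heat)  # [[8, 5], [0, 8]]
```

Higher values mark cells that satisfy more of the heavily weighted factors; rendering those scores with a color ramp gives the heat map.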

The Web is different from the Desktop

Lots of simultaneous users

Stateless environment

HTML+JS+CSS

Users are less skilled

Users are less patient

ArcGIS Server

Flex, Silverlight and JS APIs

Publish tasks and models

Caching

Optimized MSD files

But wait … there’s a problem

10 – 60 second calculation time

Multiple simultaneous users …

… that are impatient

User Interface version 1

Reports

Reports

Sustainable Business Network

Walkability: Walkshed.org

NYC Big Apps Submission

Specific Optimization Goals

New Raster File format

Distributed processing

Binary messaging protocol

Optimization: File Format

Simple - strip out metadata

Limit data type and range

1D arrays are fast to read/write

Assume
– Same extent
– Same cell size
– Same pixel data type
– Same cell alignment
– Same projection

Azavea Raster Grid (ARG)
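The slides do not describe the actual ARG byte layout, so the following is only a sketch of the general idea: if extent, cell size and data type are agreed on out of band, a raster can be stored as a single headerless 1D run of bytes. The function names and the choice of unsigned 8-bit cells are assumptions made for illustration.

```python
import io
from array import array


def write_flat(grid, stream):
    """Write a 2D grid as one row-major run of unsigned bytes, with no header."""
    flat = array("B", (value for row in grid for value in row))
    stream.write(flat.tobytes())


def read_flat(stream, rows, cols):
    """Read the grid back; rows and cols are known in advance, not stored in the file."""
    flat = array("B")
    flat.frombytes(stream.read(rows * cols))
    return [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]


buf = io.BytesIO()  # stands in for a file on disk
write_flat([[0, 10, 20], [30, 40, 255]], buf)
buf.seek(0)
restored = read_flat(buf, rows=2, cols=3)
```

Because every layer shares the same shape, the reader needs no parsing step: one bulk read fills the array, and cell (r, c) is at offset r * cols + c.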

Optimization: Distributed Processing

Parallelizable - Local Ops and Focal Ops

Support multiple
– Threads
– Cores
– CPUs
– Machines
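Local operations parallelize well because each output cell depends only on the input cell at the same position, so a grid can be cut into blocks of rows that are processed independently. The sketch below shows that split-and-merge pattern with Python's standard thread pool. The function names are hypothetical, and threads are used only to keep the example short: in CPython, a real speedup for this kind of work would need processes or separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_rows(rows, weight):
    """Local op applied to one block of rows: multiply every cell by a weight."""
    return [[weight * value for value in row] for row in rows]


def parallel_local_op(grid, weight, workers=2):
    """Split the grid into row blocks, process each block in a worker, re-join in order."""
    chunk = max(1, -(-len(grid) // workers))  # ceiling division
    blocks = [grid[i:i + chunk] for i in range(0, len(grid), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in submission order, so rows stay in place.
        results = list(pool.map(scale_rows, blocks, [weight] * len(blocks)))
    return [row for block in results for row in block]


grid = [[1, 2], [3, 4], [5, 6], [7, 8]]
scaled = parallel_local_op(grid, weight=3)
```

Focal operations can be split the same way, except that each block also needs a border of neighboring rows from the adjacent blocks.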

Considered
– Hadoop
– Amazon Map Reduce
– Beowulf

Distributed Processing

Binary Messaging Protocol

Started with XML

Binary Protocol Buffer is better
– simpler
– 3 to 10 times smaller
– 20 to 100 times faster
– less ambiguous
– a bit easier to use programmatically

Considered
– AMF
– Google Protocol Buffer
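To see why a binary encoding tends to be smaller than XML, the sketch below encodes the same small request both ways. It deliberately uses a hand-rolled layout built with Python's `struct` module rather than Protocol Buffers, and the message fields (`layer_id`, `weight`, cell values) are hypothetical, so it illustrates the principle only and says nothing about the exact ratios quoted on the slide.

```python
import struct
import xml.etree.ElementTree as ET

HEADER = "<HBI"  # little-endian: 2-byte layer id, 1-byte weight, 4-byte cell count


def encode_xml(layer_id, weight, values):
    """Text encoding: every cell carries its own opening and closing tag."""
    root = ET.Element("request", layer=str(layer_id), weight=str(weight))
    for value in values:
        ET.SubElement(root, "cell").text = str(value)
    return ET.tostring(root)


def encode_binary(layer_id, weight, values):
    """Binary encoding: fixed 7-byte header followed by one byte per cell."""
    return struct.pack(HEADER, layer_id, weight, len(values)) + bytes(values)


def decode_binary(payload):
    """Inverse of encode_binary; no text parsing is needed."""
    layer_id, weight, count = struct.unpack_from(HEADER, payload, 0)
    start = struct.calcsize(HEADER)
    return layer_id, weight, list(payload[start:start + count])


values = [0, 3, 7, 7, 2, 9, 1, 4]
xml_msg = encode_xml(12, 5, values)
bin_msg = encode_binary(12, 5, values)
print(len(xml_msg), len(bin_msg))
```

The binary message is a 7-byte header plus 8 cell bytes, while the XML version spends most of its length on markup. Decoding is also cheaper because field positions are fixed rather than discovered by parsing.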

Success!!

Reduced from 10-60 seconds to

<500 milliseconds

Additional [Experimental] Measures

Tiling

Pyramids

EC2 for planned peaks – NYC Big Apps

HTTP file caching - Varnish
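Tiling and pyramids, listed above as experimental measures, can be sketched as follows: tiling cuts a grid into fixed-size blocks that can be fetched or processed separately, and a pyramid stores progressively coarser copies so that zoomed-out requests touch fewer cells. The function names, the 2x2 averaging rule and the sample grid are assumptions for illustration; the slides do not say how Azavea built either.

```python
def tile(grid, size):
    """Split a grid into size x size tiles keyed by (tile_row, tile_col)."""
    tiles = {}
    for r in range(0, len(grid), size):
        for c in range(0, len(grid[0]), size):
            tiles[(r // size, c // size)] = [row[c:c + size] for row in grid[r:r + size]]
    return tiles


def downsample(grid):
    """Build the next pyramid level by averaging each 2x2 block (even dimensions assumed)."""
    return [
        [(grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4
         for c in range(0, len(grid[0]), 2)]
        for r in range(0, len(grid), 2)
    ]


base = [
    [1, 1, 3, 3],
    [1, 1, 3, 3],
    [5, 5, 7, 7],
    [5, 5, 7, 7],
]
tiles = tile(base, 2)      # four 2x2 tiles
level1 = downsample(base)  # one coarser 2x2 level
```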

Optimizing one process sub-optimizes others

Complex to configure and maintain
One type of operation
No interpolation
No mixing cell sizes
No mixing extents
No mixing projections
No Map Algebra
No ModelBuilder
etc.

High Performance Geoprocessing 2.0

More generic

Cache data – memory is cheaper

New programming technology

OMB Watch: Federal Spending Equity

High Performance Geoprocessing 2.0

Reduced calculation time to

~40ms

GPU geoprocessing research

But wait, there’s more…

GPU geoprocessing research

GPU geoprocessing research

New languages: CUDA, OpenCL, DirectCompute

Re-write every algorithm

Hardware Diversity

Challenges

We re-wrote a few Map Algebra operations: Local, Neighborhood, Zonal, Viewshed, etc.

15 – 120x speed improvement
– Large grids
– Large neighborhoods

Results
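A neighborhood (focal) operation shows why this class of algorithm suits a GPU: every output cell is computed from a small window of input cells and never from another output cell, so all cells can in principle be evaluated at the same time. The sketch below is plain CPU Python, not the CUDA or OpenCL code referred to in the slides, and the name `focal_sum` and the edge handling (windows clipped at the grid border) are assumptions.

```python
def focal_sum(grid, radius=1):
    """Sum of the (2*radius+1)^2 window around each cell, clipped at the grid edges."""
    rows, cols = len(grid), len(grid[0])

    def cell(r, c):
        # Reads only the input grid, so each call is independent of all others;
        # on a GPU this body would correspond to the work of one thread.
        total = 0
        for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
            for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                total += grid[rr][cc]
        return total

    return [[cell(r, c) for c in range(cols)] for r in range(rows)]


grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
summed = focal_sum(grid)
print(summed)  # [[12, 21, 16], [27, 45, 33], [24, 39, 28]]
```

The work per cell grows with the window size and the number of cells with the grid size, which fits the slide's note that the gains showed up on large grids and large neighborhoods.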

Sea Level Rise

Crime Analysis, Early Warning and Forecasting

Hunch Helper

Risk Forecasting

Animation

Food, Culture and Sustainability

Quick Demo

http://commonspace.us

The Future

Clouds of Processors - Google App Engine

Faster is different

Summary

Challenges
• Geographic data growth rates are exploding
• Size of data sets is growing
• Lidar
• Raster
• Social Media
• New form factors that are less powerful
• Distributed data sets
• Larger numbers of less technical users

New Options
• Clouds of processors
• Clouds of virtual machines
• GPUs
