The GPU Accelerated Database - Nvidiaimages.nvidia.com/content/events/geoint-2015/... · 2U...


GIS Federal Proprietary Information

The GPU Accelerated Database

801 N Quincy St, Suite 601, Arlington, VA 22203

Eli Glaser, Principal Software Engineer, [email protected]

Established 2009

Initially focused on Geospatial support engineering for US Army INSCOM

~30 Employees

Certified 8(a) Small Disadvantaged Business

OpenPOWER Foundation Member

In-memory distributed database using GPUs for processing

Ultrafast ingest and analysis of billions of objects

Built-in visualization

Full text search

Massive number of computational cores

NVIDIA K80: 4992 cores

IBM Power8: 12 cores / 96 threads

Intel Xeon: 18 cores / 36 threads

Massive memory bandwidth

NVIDIA K80: 480 GB/s

IBM Power8: 196 GB/s

Intel Xeon: 60-85 GB/s
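
To put the bandwidth figures above in perspective, the sketch below estimates how long one full pass over a dataset would take if memory bandwidth were the only limit. The 24 GB dataset size is an illustrative assumption, and the Xeon entry uses the midpoint of the quoted 60-85 GB/s range. The results are lower bounds, not measurements.

```python
# Peak memory bandwidth in GB/s, taken from the figures above.
# The Xeon value is the midpoint of the quoted 60-85 GB/s range (an assumption).
BANDWIDTH_GB_PER_S = {
    "NVIDIA K80": 480.0,
    "IBM Power8": 196.0,
    "Intel Xeon": (60.0 + 85.0) / 2,
}

DATASET_GB = 24.0  # illustrative dataset size, not a figure from the slides


def scan_seconds(dataset_gb: float, bandwidth_gb_per_s: float) -> float:
    """Lower bound on the time for one full pass over the data."""
    return dataset_gb / bandwidth_gb_per_s


for name, bandwidth in BANDWIDTH_GB_PER_S.items():
    millis = scan_seconds(DATASET_GB, bandwidth) * 1000
    print(f"{name}: at least {millis:.0f} ms per full scan")
```

Real scans are slower than these bounds because of kernel launch overhead, access patterns, and data transfer, none of which is modeled here.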

Unfamiliar programming paradigm

But the tools and libraries are getting better and better

Memory Management

Often the key to achieving high performance

Disk->RAM->vRAM (GPU)

Distributed GPU job coordination and scheduling

Aligning computational cores with the data
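
The Disk->RAM->vRAM hierarchy above can be sketched as two nested levels of chunking, where each tier holds only as much data as its budget allows. This is a minimal illustration in plain Python, not GPUdb's memory manager: the function names and budgets are hypothetical, and the built-in `sum` stands in for a GPU kernel.

```python
from typing import Iterator, Sequence


def chunks_for_budget(records: Sequence[int], budget: int) -> Iterator[Sequence[int]]:
    """Yield consecutive slices holding at most `budget` records."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    for start in range(0, len(records), budget):
        yield records[start:start + budget]


def staged_sum(on_disk: Sequence[int], ram_budget: int, vram_budget: int) -> int:
    """Sum all records without exceeding either tier's budget."""
    total = 0
    for ram_chunk in chunks_for_budget(on_disk, ram_budget):          # Disk -> RAM
        for vram_chunk in chunks_for_budget(ram_chunk, vram_budget):  # RAM -> vRAM
            total += sum(vram_chunk)  # stand-in for work done on the GPU
    return total


print(staged_sum(list(range(1000)), ram_budget=256, vram_budget=64))
```

The point of the sketch is that the smallest tier (vRAM) dictates the unit of work, which is why chunk sizing tends to dominate performance tuning.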

A big data object store and calculation engine that is accelerated with NVIDIA Graphics Processing Units (GPUs)

Enables big data analytics on the fly

Run data intensive algorithms on big data in sub-second time

Native geospatial object support (points, shapes, tracks) for visualization as an image or video

Full text search including wildcards

Leverage the power of GPUs without becoming a GPU programming expert

High Performance Computing with commodity hardware costs

Scalable from a single laptop to a large cluster

Order of magnitude performance gain compared to CPU-based solutions

Order of magnitude power savings

Order of magnitude (or more) cost savings

Organizations can spend tens of millions of dollars (or more) on licensing fees for traditional databases

Typically licensed on a per-CPU-core basis

Need many CPU cores to support billions+ objects

GPUdb can provide much better price/performance

GIS Federal is a member of the OpenPOWER Foundation

GPUdb is fully integrated and optimized on OpenPOWER hardware and software

NVIDIA Tesla K80 tested and certified

IBM CAPI Large Scale Flash Memory Integration underway

NVIDIA NVLink hardware beta testers

Initial benchmarks show >20% performance improvement over comparable x86 servers on CPU-intensive tasks

Anticipating order-of-magnitude improvements via NVLink bandwidth enhancements

16 GB/s → 80+ GB/s
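
As a rough illustration of what the link figures above mean for data movement, the sketch below computes the time to copy a fixed payload at each rate. The 16 GB/s and 80 GB/s values come from the line above (80 is the lower end of "80+"); the 12 GB payload is an illustrative assumption.

```python
CURRENT_LINK_GB_PER_S = 16.0  # figure quoted above
NVLINK_GB_PER_S = 80.0        # lower end of the anticipated "80+ GB/s"
PAYLOAD_GB = 12.0             # illustrative payload size (an assumption)


def transfer_seconds(payload_gb: float, link_gb_per_s: float) -> float:
    """Ideal transfer time, ignoring latency and protocol overhead."""
    return payload_gb / link_gb_per_s


before = transfer_seconds(PAYLOAD_GB, CURRENT_LINK_GB_PER_S)
after = transfer_seconds(PAYLOAD_GB, NVLINK_GB_PER_S)
link_speedup = NVLINK_GB_PER_S / CURRENT_LINK_GB_PER_S

print(f"{before:.2f} s -> {after:.2f} s ({link_speedup:.0f}x faster link)")
```

At the quoted rates the raw link is 5x faster, so the figure of 80 GB/s should be read as a floor on the anticipated bandwidth rather than a ceiling.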

Abstracts distributed GPU processing from software developers

Memory management

Cluster-wide GPU job scheduling

Automatic sharding and indexing

Includes accelerated geospatial, temporal, financial and machine learning processing functions

Simple HTTP REST-like API

Available API language wrappers: JavaScript, Java, Python, C++, C#

Trivial to add new language wrappers

SQL-like query language
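
The slides do not describe how GPUdb's automatic sharding works. As a generic illustration of the idea, the sketch below assigns records to shards with a stable hash of a key field. The function names, the choice of CRC32, and the sample records are all assumptions made for this example.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (CRC32)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    return zlib.crc32(key.encode("utf-8")) % num_shards


def shard_records(records: Iterable[dict], key_field: str, num_shards: int) -> Dict[int, List[dict]]:
    """Group records by the shard their key hashes to."""
    shards: Dict[int, List[dict]] = defaultdict(list)
    for record in records:
        shards[shard_for(str(record[key_field]), num_shards)].append(record)
    return dict(shards)


# Hypothetical sample data: 100 track points keyed by an object id.
points = [{"id": f"track-{i}", "x": float(i), "y": 0.0} for i in range(100)]
by_shard = shard_records(points, key_field="id", num_shards=4)
print({shard: len(items) for shard, items in sorted(by_shard.items())})
```

A stable hash matters here because the same key must land on the same shard across processes and restarts, which Python's built-in `hash` does not guarantee for strings.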

Orders of magnitude faster than traditional relational and ‘NoSQL’ competitors

Particularly for queries that need to scan all the data (e.g. count, sum)

Reduced development costs for data scaling and data analytics

Vastly smaller power and space footprint for greater computational capability
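
Full-scan queries such as count and sum parallelize well because each partition can be aggregated independently and the partial results merged at the end. The sketch below shows that pattern with CPU threads standing in for GPU workers. It illustrates the general technique and is not GPUdb's implementation; the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def partial_aggregate(shard: Sequence[int], predicate: Callable[[int], bool]) -> Tuple[int, int]:
    """Return (count, sum) of the values in one shard that match the predicate."""
    matching = [value for value in shard if predicate(value)]
    return len(matching), sum(matching)


def scan_count_sum(shards: List[Sequence[int]], predicate: Callable[[int], bool]) -> Tuple[int, int]:
    """Aggregate every shard in parallel, then merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda shard: partial_aggregate(shard, predicate), shards))
    total_count = sum(count for count, _ in partials)
    total_sum = sum(subtotal for _, subtotal in partials)
    return total_count, total_sum


shards = [[1, 2, 3], [4, 5], [6]]
print(scan_count_sum(shards, lambda value: value % 2 == 0))
```

Count and sum merge by simple addition, which is why they are the usual examples. An aggregate like median would need a different merge step.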

US Army INSCOM ‘Red Disk’

In-memory computational engine for all data with geospatial and/or temporal components

STANAG track data

Intelligence reports

Sensor feeds

Integration with Apache Accumulo including per-object access control

SGI UV2000 Shared-Memory Machine

10 TB RAM

16 K40 GPUs

USPS

In production ingesting and processing billions of objects

Geospatial breadcrumbs of USPS carriers

Route visualization and editing

Triggers and alerts

Mail delivery optimization

Track mail pieces as they move from location to location

Multiple SGI UV2000s with 60+ Tesla K40s

IDC HPC User Forum

Won IDC HPC Innovation Excellence Award at SC14

For the US Army and the intelligence community as a whole, GIS Federal developed an innovative approach to quickly filter, analyze, and visualize in near real time big data streams from hundreds of data providers with a particular emphasis on geospatial data. GIS Federal leveraged the highly parallel compute power of graphical processing units (GPUs) to conduct the data processing. The solution generated multimillion-dollar revenues while saving tens of millions of dollars.

http://www.idc.com/getdoc.jsp?containerId=prUS25250214

GPUdb is capable of ingesting network 'flow' data at very high speeds

Massive threading capability allows for computationally intensive deep packet processing analytics

Native IPv4 and IPv6 attribute types for advanced network-oriented query construction
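
Treating addresses as typed values rather than strings is what makes network-oriented queries such as subnet filters cheap. The sketch below shows the idea with Python's standard `ipaddress` module. It does not show GPUdb's attribute types, and the flow records are made-up examples using documentation address ranges.

```python
import ipaddress
from typing import List


def to_int(address: str) -> int:
    """Convert an IPv4 or IPv6 address to its integer form."""
    return int(ipaddress.ip_address(address))


def in_network(address: str, network: str) -> bool:
    """True if the address belongs to the network (same IP version required)."""
    addr = ipaddress.ip_address(address)
    net = ipaddress.ip_network(network)
    return addr.version == net.version and addr in net


def flows_from(flows: List[dict], network: str) -> List[dict]:
    """Select flow records whose source address lies in the given network."""
    return [flow for flow in flows if in_network(flow["src"], network)]


# Hypothetical flow records; addresses come from reserved documentation ranges.
flows = [
    {"src": "192.0.2.1", "dst": "198.51.100.7", "bytes": 1200},
    {"src": "2001:db8::1", "dst": "2001:db8::2", "bytes": 640},
    {"src": "203.0.113.9", "dst": "192.0.2.1", "bytes": 80},
]
print(flows_from(flows, "192.0.2.0/24"))
```

Once an address is an integer, a subnet test reduces to a range comparison, which is the kind of operation that maps well onto massively parallel hardware.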

Native understanding of geospatial objects including points, shapes, tracks

Shape processing: within, contains, intersection, etc.

Convex hull

Track analytics
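
Two of the operations named above, convex hull and a "within" test, can be sketched in a few lines of plain Python. The hull uses Andrew's monotone chain algorithm and the containment test uses ray casting. Both are textbook techniques chosen for illustration, since the slides do not say which algorithms GPUdb uses.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    upper: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]  # drop duplicated endpoints


def point_within(point: Point, polygon: Sequence[Point]) -> bool:
    """Ray casting test; points exactly on the boundary are not handled specially."""
    x, y = point
    inside = False
    for i in range(len(polygon)):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % len(polygon)]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


hull = convex_hull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)])
print(hull, point_within((1, 1), hull), point_within((3, 1), hull))
```

The sketch works in planar coordinates. Real geospatial data in latitude and longitude would need projection or spherical geometry, which is omitted here.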

Includes a full embedded WMS server for easy integration with visual mapping frameworks

Google Earth / Google Maps

Cesium

OpenLayers

ESRI ArcGIS JS API
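
Mapping frameworks like those listed above talk to a WMS server through GetMap requests. The sketch below builds such a request URL using the standard WMS 1.1.1 query parameters. The endpoint URL and layer name are hypothetical placeholders, because the slides do not give GPUdb's actual WMS path or layer naming.

```python
from typing import Tuple
from urllib.parse import parse_qs, urlencode, urlsplit


def build_getmap_url(
    endpoint: str,
    layer: str,
    bbox: Tuple[float, float, float, float],
    width: int,
    height: int,
) -> str:
    """Build a WMS 1.1.1 GetMap URL; bbox is (min_x, min_y, max_x, max_y)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",            # empty value requests the default style
        "SRS": "EPSG:4326",      # longitude/latitude coordinates
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and layer name, for illustration only.
url = build_getmap_url("http://example.com/wms", "tweets", (-180, -90, 180, 90), 1024, 512)
print(url)
```

Because the server returns a rendered image, the client never has to download the underlying objects, which is what makes this integration style practical for billions of points.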

Use cell phone tracks to find when people might have been in contact with ‘patient zero’
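
A minimal sketch of this kind of contact query follows, assuming each track is a list of `(timestamp_seconds, x_meters, y_meters)` samples in a planar coordinate system. The data layout, function name, and thresholds are assumptions for illustration. The brute-force comparison shown here is exactly the sort of all-pairs work that benefits from GPU parallelism at scale.

```python
import math
from typing import Dict, List, Tuple

Sample = Tuple[float, float, float]  # (timestamp_seconds, x_meters, y_meters)


def find_contacts(
    tracks: Dict[str, List[Sample]],
    patient_zero: str,
    max_distance: float,
    max_seconds: float,
) -> Dict[str, float]:
    """Return each contacted person with the earliest timestamp of contact."""
    source = tracks[patient_zero]
    contacts: Dict[str, float] = {}
    for person, samples in tracks.items():
        if person == patient_zero:
            continue
        for t, x, y in samples:
            for t0, x0, y0 in source:
                close_in_time = abs(t - t0) <= max_seconds
                close_in_space = math.hypot(x - x0, y - y0) <= max_distance
                if close_in_time and close_in_space:
                    if person not in contacts or t < contacts[person]:
                        contacts[person] = t
    return contacts


# Made-up tracks: "a" passes near patient zero, "b" is too far, "c" is too late.
tracks = {
    "p0": [(0, 0.0, 0.0), (60, 10.0, 0.0)],
    "a": [(30, 5.0, 1.0)],
    "b": [(30, 500.0, 0.0)],
    "c": [(7200, 0.0, 0.0)],
}
print(find_contacts(tracks, "p0", max_distance=10.0, max_seconds=60))
```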

5 node Cluster

Single Node: $1,008.00

1U used server from eBay – $869.00

2x Intel Xeon X5650 (6-core, 2.66 GHz)

72 GB RAM

3 TB HDD

1x NVIDIA GTX 750 Ti GPU - $140.00

640 cores

2 GB vRAM

Maxwell Architecture

Able to query and render over 2 Billion Tweets in ~1 second

Total Price: about $5k

2 node Cluster

Single Node

2U SuperMicro Server

2x Intel Xeon E5-2690 v3 (12-core, 2.60 GHz)

512 GB RAM

3 TB SSD

2x NVIDIA K80 GPU

2x2496 cores per card

2x12 GB vRAM per card

Kepler Architecture

Able to query and render 15+ Billion Tweets in ~1 second

Total Price: about $50k
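
The two example clusters can be compared with some simple arithmetic on the figures quoted above. The sketch below totals GPU cores and vRAM and computes tweets queried per dollar. The tweet counts ("over 2 Billion", "15+ Billion") and prices ("about") are approximate in the source, so the ratios are rough.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str
    nodes: int
    cards_per_node: int
    cores_per_card: int
    vram_gb_per_card: int
    tweets_per_query: int  # lower bound quoted in the slides
    price_usd: int         # approximate total price quoted in the slides

    @property
    def total_cores(self) -> int:
        return self.nodes * self.cards_per_node * self.cores_per_card

    @property
    def total_vram_gb(self) -> int:
        return self.nodes * self.cards_per_node * self.vram_gb_per_card

    @property
    def tweets_per_dollar(self) -> float:
        return self.tweets_per_query / self.price_usd


# A K80 card carries 2 x 2496 cores and 2 x 12 GB of vRAM.
SMALL = Cluster("5 node GTX 750 Ti", 5, 1, 640, 2, 2_000_000_000, 5_000)
LARGE = Cluster("2 node K80", 2, 2, 2 * 2496, 2 * 12, 15_000_000_000, 50_000)

for cluster in (SMALL, LARGE):
    print(
        f"{cluster.name}: {cluster.total_cores} cores, "
        f"{cluster.total_vram_gb} GB vRAM, "
        f"{cluster.tweets_per_dollar:,.0f} tweets per dollar"
    )
```

On these figures the two configurations land in the same range of tweets per dollar, which is consistent with the claim that the system scales from commodity hardware to larger servers.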

GPUdb Homepage – http://www.gpudb.com

GPUdb Demo Site – http://www.gpudb.com/gaiademo

GPUdb Tutorial video - https://www.youtube.com/watch?v=CNK7Mr5h8k0

IDC HPC User Forum presentation - https://www.youtube.com/watch?v=fY6FUOsUZKY

IDC HPC Innovation Excellence Award - http://www.idc.com/getdoc.jsp?containerId=prUS25250214

Datanami GPU powered Terrorist Hunter Article - http://www.datanami.com/2014/10/08/gpu-powered-terrorist-hunter-eyes-commercial-big-data-role/

SGI, NVIDIA, and GIS Federal INSCOM Article with UV2000 and 16 Tesla K40s - http://www.sgi.com/company_info/newsroom/press_releases/2014/april/gis_federal.html
