SpaceCurve - Integrating with Hadoop

© 2015 SpaceCurve, Inc. Confidential. | 1

Transcript of SpaceCurve - Integrating with Hadoop

Page 1: SpaceCurve - Integrating with Hadoop

Page 2: SpaceCurve - Integrating with Hadoop

•  Spatial Data
•  Hadoop Ecosystem
•  SpaceCurve’s Spatial Data Platform
•  Integrating with Hadoop

Page 3: SpaceCurve - Integrating with Hadoop

Page 4: SpaceCurve - Integrating with Hadoop

•  The largest datasets are geospatial in nature
–  Daily generation of petabytes of data
–  Most is not used or simply discarded

•  Proliferation of mobile platforms, sensors and IoT
–  More geospatial data will be generated in real-time

•  Typical big data solutions can scale to ingest and store vast quantities of data
–  But these are not designed for real-time, geospatial data

Page 5: SpaceCurve - Integrating with Hadoop

Devices > People: In 2008, # of internet devices exceeded # of people on earth

20 - 50 Billion: Estimated # of connected devices by 2020

80% of all data has spatial attributes*

90% of all mobile data is location aware*

*According to Gartner

Page 6: SpaceCurve - Integrating with Hadoop

✓ Mobile Platforms
✓ Operational Intelligence
✓ Sensored World/Digital Business
✓ Context Rich Autonomous Systems
✓ Smart Machines/M2M

Source: Gartner Technology Trends 2015

Page 7: SpaceCurve - Integrating with Hadoop

THE WORLD IS A STATIC MAP
Example: Map coordinates of points of interest cataloged and described on the Internet.

CAPTURING THE MOTION OF THINGS
Example: Packages have passive sensors; we can track them on the web and know where they passed checkpoints.

REMOTE CONTROL OF THINGS
Example: UAVs used as remote sensing platforms for emergency response.

THINGS TALK TO EACH OTHER
Example: Aircraft optimize fuel consumption in real-time using data from internal and external sensor networks.

THINGS BEHAVE INTELLIGENTLY
Example: Large fleets of autonomous vehicles adapting to weather conditions and traffic congestion.

Page 8: SpaceCurve - Integrating with Hadoop

Page 9: SpaceCurve - Integrating with Hadoop

•  Hadoop’s open source platform has become synonymous with big data processing

•  Core ecosystem:
–  Distributed file system for data storage (HDFS)
–  Distributed processing of data at scale (MapReduce)
–  Batch-oriented job execution

•  Hadoop-based solutions excel at:
–  Ingesting and data warehousing multiple sources of data
–  Creating and updating analytical dashboards on a weekly, daily or hourly basis
–  Providing insights from historical data that apply to future scenarios
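The MapReduce model named in the core ecosystem list can be illustrated with a minimal, single-process sketch in plain Python. This is not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample coordinates are hypothetical, chosen only to show the map, group-by-key, and reduce steps that a batch job runs over a complete input.

```python
import math
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Mapper: emit (grid cell, 1) for the 1-degree cell containing the point."""
    lon, lat = record
    # math.floor buckets negative coordinates into the correct cell.
    yield (math.floor(lon), math.floor(lat)), 1


def shuffle(pairs):
    """Group intermediate values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one key."""
    return key, sum(values)


def run_job(records):
    """Run the whole batch: map every record, shuffle, then reduce each group."""
    pairs = chain.from_iterable(map_phase(r) for r in records)
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample input: (longitude, latitude) pairs.
SAMPLE = [(-122.3, 47.6), (-122.9, 47.1), (-118.2, 34.1), (-122.4, 47.9)]
COUNTS = run_job(SAMPLE)
print(COUNTS)
```

The job produces results only after all records have been mapped and grouped, which is why this style suits periodic dashboards and historical analysis rather than per-event responses.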

Page 10: SpaceCurve - Integrating with Hadoop

•  The Hadoop ecosystem can scale to geospatial storage requirements

•  HDFS is not efficient for organizing and analyzing these data models because:
–  Geospatial data does not have a predictable, uniform distribution
–  Hash functions can transform unpredictable, non-uniform distributions, but they do not preserve or expose geospatial biases and relationships efficiently

•  Results:
–  Reduction in parallelism and efficiency of geospatial analysis
–  Inability to implement computational geometry needed for geospatial analytics
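The trade-off described on this slide can be made concrete with a small sketch. It assumes a made-up, heavily clustered dataset and two hypothetical placement functions: `hash_partition`, which spreads points evenly but ignores location, and `grid_partition`, which keeps neighbours together but inherits the data's skew. Neither is taken from Hadoop or SpaceCurve; they only illustrate the two failure modes.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 4


def hash_partition(point):
    """Place a point by hashing its text form; spatial locality is not preserved."""
    digest = hashlib.md5(repr(point).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS


def grid_partition(point):
    """Place a point by the quadrant of the lon/lat plane it falls in."""
    lon, lat = point
    return (1 if lon >= 0 else 0) + (2 if lat >= 0 else 0)


# Hypothetical skewed data: 90 points clustered in one area, 10 spread out.
cluster = [(-122.30 - 0.001 * i, 47.60 + 0.001 * i) for i in range(90)]
sparse = [(-170 + 36 * i, -80 + 16 * i) for i in range(10)]
points = cluster + sparse

hash_load = Counter(hash_partition(p) for p in points)
grid_load = Counter(grid_partition(p) for p in points)

# Count adjacent cluster points that hashing sends to different partitions.
split_neighbours = sum(
    hash_partition(a) != hash_partition(b) for a, b in zip(cluster, cluster[1:])
)

print("hash partition sizes:", sorted(hash_load.values()))
print("grid partition sizes:", sorted(grid_load.values()))
print("adjacent points split by hashing:", split_neighbours, "of", len(cluster) - 1)
```

With spatial placement, one partition holds the whole cluster and does most of the work, which reduces parallelism. With hashed placement the load is balanced, but a geometric operation on neighbouring points has to gather them from several partitions first.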

Page 11: SpaceCurve - Integrating with Hadoop

Page 12: SpaceCurve - Integrating with Hadoop

CONTINUOUS HIGH-VELOCITY data ingestion rates are far beyond the limits of traditional spatial analysis platforms.

SPATIAL ANALYTICS required for high-value Internet of Everything applications are not supportable on popular big data platforms.

REAL-TIME operational analysis requirements preclude the use of batch-oriented platforms.

DATA VOLUME greatly exceeds capacity of platforms designed for real-time analysis of human-generated sources.

Page 13: SpaceCurve - Integrating with Hadoop

•  SpaceCurve has created the first platform purpose-built from the ground up:
–  Designed for organizing multiple streams of very large scale geospatial data
–  Optimized for analyzing data in real-time
–  Eliminates limitations on geospatial data inherent in other platforms

•  The SpaceCurve platform makes it possible to:
–  Collect and fuse multiple sources of data in real-time and immediately stream the result to an application
–  Run continuous queries and analytics with second and sub-second responses
–  Provide insights from real-time data that can apply to current, immediate scenarios
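The idea of fusing several sources and running a continuous query over them can be sketched generically in Python. This is not SpaceCurve's API or query language. The `Observation` and `BoundingBox` types, the `fuse` and `continuous_query` functions, and all sample values are hypothetical. The sketch assumes each source delivers observations already ordered by timestamp.

```python
import heapq
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Observation:
    source: str
    lon: float
    lat: float
    timestamp: float


@dataclass(frozen=True)
class BoundingBox:
    west: float
    south: float
    east: float
    north: float

    def contains(self, obs: Observation) -> bool:
        """True when the observation lies inside or on the edge of the box."""
        return (self.west <= obs.lon <= self.east
                and self.south <= obs.lat <= self.north)


def fuse(*streams: Iterable[Observation]) -> Iterator[Observation]:
    """Lazily merge time-ordered streams into one time-ordered stream."""
    return heapq.merge(*streams, key=lambda obs: obs.timestamp)


def continuous_query(stream: Iterable[Observation],
                     area: BoundingBox) -> Iterator[Observation]:
    """Yield each matching observation as it arrives, without buffering the stream."""
    for obs in stream:
        if area.contains(obs):
            yield obs


# Made-up sample data from two sources.
gps = [Observation("gps", -122.33, 47.61, 1.0),
       Observation("gps", -121.00, 46.00, 3.0)]
sensor = [Observation("sensor", -122.35, 47.62, 2.0),
          Observation("sensor", -122.31, 47.60, 4.0)]
area = BoundingBox(west=-122.45, south=47.48, east=-122.22, north=47.74)

matches = list(continuous_query(fuse(gps, sensor), area))
for match in matches:
    print(match)
```

Because both functions are lazy, a result is produced for each observation as it is read. A batch job, by contrast, reports only after the full input has been processed.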

Page 14: SpaceCurve - Integrating with Hadoop

CONTINUOUS HIGH-VELOCITY INGESTION

COMPLEX SPATIAL DATA TYPES & OPERATIONS

EXTREME DATA VOLUMES

REAL-TIME QUERY EXECUTION & ANALYSIS

Page 15: SpaceCurve - Integrating with Hadoop

Page 16: SpaceCurve - Integrating with Hadoop

•  Integration at the HDFS layer

•  Enables all current systems and tools to be utilized in their normal workflows

•  Leverages existing investments and enables real-time geospatial use cases

•  Build combined workflows that operate in parallel, or in which Hadoop components issue queries to SpaceCurve

Page 17: SpaceCurve - Integrating with Hadoop

•  Additional resources can be found below:

–  GitHub – https://github.com/SpaceCurve/hadoop
   •  This resource outlines the mechanics of export/import between SpaceCurve and Hadoop and includes a step-by-step tutorial using California earthquake data

–  SpaceCurve VM – available upon request
   •  This resource lets a user install the SpaceCurve system loaded with sample data and use SpaceCurve SQL to query the data

Page 18: SpaceCurve - Integrating with Hadoop

[Diagram labels: ESRI Tools, HDFS, MapReduce, Hive, GeoJSON, Mapper, Reducer, Hive SQL, SpaceCurve, HTTP/JSON, Hadoop Ecosystem]
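The diagram labels GeoJSON as an exchange format alongside HDFS and Hive. As a rough illustration of what a hand-off at the HDFS layer might involve, the sketch below flattens GeoJSON Point features into tab-separated lines, a layout that a Hive text table can be declared to read. It is not the code from the SpaceCurve GitHub repository: the function names `geojson_points_to_rows` and `rows_to_tsv` and the sample feature values are made up.

```python
import json


def geojson_points_to_rows(feature_collection):
    """Flatten GeoJSON Point features into (lon, lat, properties-as-JSON) rows."""
    rows = []
    for feature in feature_collection.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue  # This sketch handles points only and skips other geometries.
        lon, lat = geometry["coordinates"][:2]
        props = json.dumps(feature.get("properties") or {}, sort_keys=True)
        rows.append((lon, lat, props))
    return rows


def rows_to_tsv(rows):
    """Render rows as tab-separated lines, one record per line."""
    return "\n".join(f"{lon}\t{lat}\t{props}" for lon, lat, props in rows)


# Made-up sample document; the values are illustrative, not real measurements.
SAMPLE = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-118.5, 34.2]},
            "properties": {"mag": 4.1},
        },
        {
            "type": "Feature",
            "geometry": {"type": "LineString", "coordinates": [[0, 0], [1, 1]]},
            "properties": {},
        },
    ],
}

TSV = rows_to_tsv(geojson_points_to_rows(SAMPLE))
print(TSV)
```

Keeping the properties as a JSON string in the last column avoids committing to a fixed schema. A real export would more likely map known properties to typed columns.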

Page 19: SpaceCurve - Integrating with Hadoop
