An Introduction to ParStream

March 2016 An Introduction to ParStream Joerg Bienert

Page 1

March 2016

An Introduction to ParStream

Joerg Bienert

Page 2

Imagine a World… Where IoT Analytics Delivers 15% More Output from Renewable Energy Sources

30 TB: Analyze Data in Real-time

15%: Increase Efficiency

$158M/yr: Generate Operational/Economic Benefits

(20,000 Wind Turbines; 10 GW Capacity; 0.3 Capacity Factor; $40/MW-hour)
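The $158M/yr figure can be reproduced from the parameters in parentheses. A minimal sketch, assuming an 8,760-hour year and that the 15% gain applies directly to baseline annual revenue (neither assumption is stated on the slide; the function name is illustrative):

```python
# Back-of-the-envelope check of the slide's $158M/yr figure.
# Assumptions (not stated on the slide): 8,760 hours per year, and the
# 15% gain is applied directly to baseline annual revenue.

HOURS_PER_YEAR = 24 * 365  # 8,760


def annual_benefit_usd(capacity_mw, capacity_factor, price_per_mwh, gain):
    """Extra annual revenue from a fractional output gain."""
    baseline_mwh = capacity_mw * capacity_factor * HOURS_PER_YEAR
    baseline_revenue = baseline_mwh * price_per_mwh
    return baseline_revenue * gain


benefit = annual_benefit_usd(
    capacity_mw=10_000,    # 10 GW across 20,000 turbines
    capacity_factor=0.3,
    price_per_mwh=40.0,
    gain=0.15,
)
print(f"${benefit / 1e6:.1f}M per year")
```

Under these assumptions the baseline is about 26.3 million MWh and $1.05B per year, so a 15% gain is roughly $157.7M, which rounds to the slide's $158M.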

Page 3

ParStream Responds to Distinct Requirements for IoT

REAL-TIME INSIGHTS: Use cases require near-real-time analytics (millisecond query response time)

DATA GROWTH: Data is growing faster and bigger because of the number of sensors (10B+ rows, 5TB+)

FAST DATA: Data streamed from sensors requires fast ingestion (1M+ rows per sec)

EDGE ANALYTICS: IoT data is mostly generated at the 'Edges' of the network (1000+ locations)

Page 4

ParStream is uniquely positioned for Real-time Analytics in IoT

REAL-TIME IMPORT

REAL-TIME QUERYING

FLEXIBLE ANALYTICS

Billions of Records

Thousands of Columns

Small Form Factor / Low TCO

Edge-Analytics

Page 5

ParStream is Integrated with Leading IoT Solutions

Custom Apps

Analytics

Visualization

Data Collection

DG Logic

Cisco ParStream

DG Logic

Page 6

Choose Your Database Based on Your Use-case

[Chart: database categories positioned by response time and data volume. Response-time axis ticks: < 1..10 ms, 10..100 ms, 1 sec, 1 min, 10 min, 1 hr, 4 hrs. Data-volume axis ticks: Gigabyte, Terabyte, Petabyte. Labels in the chart: OLTP, In-Memory DB, Cassandra, Complex Event Processing, Stream-Analytics, Real-Time IoT Analytics, Cisco ParStream, Massively Parallel (MPP), OLAP, Reporting, Hadoop, Batch-Analytics, Real-Time, Big Data, Operations, Analytics, High, Low.]

Page 7

ParStream's Patented Technology Provides a Competitive Advantage

[Architecture diagram labels: SQL API / JDBC / ODBC; Real-Time Analytics Engine; In-Memory and Disk Technology; Multi-Dimensional Partitioning; Massively Parallel Processing (MPP); Shared Nothing Architecture; High Performance Compressed Index (HPCI); 3rd Generation Columnar Storage; High Speed Parallel Loader with Low Latency; C++ UDx API]

• High Performance Compressed Indexes: Provide ultra-high query performance

• Lockless Architecture: Enables ultra-fast query and data import performance

• Massive Parallel Processing: Delivers linear scalability and high query throughput

• Small Footprint: Enables analytics at the edge with a low TCO
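The slides do not describe how the High Performance Compressed Index works internally. As a generic illustration of why compressed bitmap indexes can answer filter queries quickly, here is a minimal sketch of a run-length-encoded bitmap index. It is not ParStream's HPCI; every function name and the toy data are hypothetical.

```python
from itertools import groupby

# Generic run-length-encoded bitmap index. NOT ParStream's HPCI: the slides
# give no implementation details, so this only illustrates the general idea.


def rle_encode(bits):
    """Compress a list of 0/1 values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(bits)]


def rle_and(a, b):
    """AND two equal-length RLE bitmaps without decompressing them."""
    out = []
    i = j = 0
    a_left = a[0][1] if a else 0
    b_left = b[0][1] if b else 0
    while i < len(a) and j < len(b):
        step = min(a_left, b_left)
        value = a[i][0] & b[j][0]
        if out and out[-1][0] == value:
            out[-1] = (value, out[-1][1] + step)  # merge adjacent equal runs
        else:
            out.append((value, step))
        a_left -= step
        b_left -= step
        if a_left == 0:
            i += 1
            a_left = a[i][1] if i < len(a) else 0
        if b_left == 0:
            j += 1
            b_left = b[j][1] if j < len(b) else 0
    return out


def build_index(column):
    """One compressed bitmap per distinct value in the column."""
    return {
        v: rle_encode([1 if x == v else 0 for x in column])
        for v in set(column)
    }


def matching_rows(bitmap):
    """Row numbers whose bit is set in an RLE bitmap."""
    rows, pos = [], 0
    for value, length in bitmap:
        if value:
            rows.extend(range(pos, pos + length))
        pos += length
    return rows


# Toy query: rows where status == "fault" AND site == "north".
status = ["ok", "ok", "fault", "fault", "ok", "fault"]
site = ["north", "north", "north", "south", "south", "north"]

hits = matching_rows(
    rle_and(build_index(status)["fault"], build_index(site)["north"])
)
print(hits)
```

The point of the sketch is that the AND runs over runs rather than individual rows, so long stretches of identical bits cost one step each.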

Page 8

ParStream Has the Fastest Query Response Times

Environment: Single EC2 XL node with 15 GB RAM, 2 TB disk on Amazon AWS.

OTP data set with 150 Million records. Query set based on customer use-cases.

[Bar chart of query response times. Values shown: 1 second, 10 seconds, 22 seconds, 31 seconds, 38 seconds, 98 seconds. Surviving system labels: RedShift, ParStream.]

Page 9

Edge Analytics Delivers Real-time Insights by Minimizing Network Traffic

Overcoming Bandwidth Limitations and Reducing Report Delays Requires Analytics to be Pushed Closer to the Data Source

Central Analytics Intelligence

[Diagram labels: Application; Query; Database; More Than 20 Billion Records Returned; Search Results: 40 Records Found]

Hybrid Edge / Centralized Intelligence

[Diagram labels: Application; Query; ParStream Geo-Distributed Server; five ParStream nodes, each labeled 4 Billion Records; per-node results of 7, 18, 5, 12, and 8 Records; <100 Records; Search Results: 40 Records Found]

Geo-Distributed setup substantially reduces network traffic, enabling continuous monitoring (sampling proved insufficient)
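The traffic reduction in the hybrid setup comes from evaluating the filter where the data lives and shipping only the matches. A minimal sketch with toy data follows; the node names, readings, threshold, and function names are hypothetical and do not represent ParStream's API.

```python
# Toy comparison of records sent over the network in the two setups.
# Node names, readings, and the threshold are made up for illustration.

edge_nodes = {
    "site_a": [12, 48, 97, 33, 5],
    "site_b": [88, 91, 14, 2],
    "site_c": [7, 19, 23],
}


def is_match(reading, threshold=85):
    """Query predicate: readings above the threshold are of interest."""
    return reading > threshold


def central_query(nodes):
    """Ship every record to a central database, then filter there."""
    shipped = [r for readings in nodes.values() for r in readings]
    results = [r for r in shipped if is_match(r)]
    return results, len(shipped)


def edge_query(nodes):
    """Filter on each node and ship only the matching records."""
    results, shipped = [], 0
    for readings in nodes.values():
        local_hits = [r for r in readings if is_match(r)]
        shipped += len(local_hits)
        results.extend(local_hits)
    return results, shipped


central_results, central_traffic = central_query(edge_nodes)
edge_results, edge_traffic = edge_query(edge_nodes)

# Both setups return the same answer; only the traffic differs.
print(sorted(central_results) == sorted(edge_results))
print(central_traffic, edge_traffic)
```

Both functions return the same matches, but the central variant moves every record while the edge variant moves only the hits, which is the contrast the two diagrams draw.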

Page 10

Customer Proof Point for IoT in Renewable Energy: Real-time Analytics for Wind Turbines

Business Challenge

• Optimize wind turbine performance by quickly adjusting to changing environmental factors (e.g., wind direction, temperature)

• Minimize turbine downtime through predictive maintenance

Use Case

• Real-time and continuous monitoring of data from 20,000 wind turbines (each with 150 sensors), including analysis of 30TB of historical data

ParStream's Technology Value Proposition

• Real-time monitoring of continuous data-flow for immediate insights/actions

• Historical analysis through storage and analytics in an integrated platform that immediately imports and stores readings from turbines

Benefits/Results (estimated)

• 15% improvement in productivity

• Decreased downtime

• $158M of annual economic benefits

Page 11

Customer Proof Point for IoT in Automotive/Telematics: Usage-Based-Insurance (UBI) + Real-time Value-added Services

Business Challenge

• Optimization of multiple systems for efficiency and operational (automated) decisions on billions of records

• Enabling new service-driven business models

Use Case

• Real-time monitoring of continuous GPS data and event flows

ParStream's Technology Value Proposition

• Over 260 million new records/month for real-time analytics

• 31 billion records of historical data

• ParStream collects all data from different systems in near real time

Benefits/Results

• Reduced overall data manipulation time by over 90%

• Reduced annual hardware by over 60%

• Improved execution time and scheduling efforts

• Improved analysis/prediction of driver profiles

Page 12

Customer Proof Point for IoT in Manufacturing: Real-time Analytics for Semiconductor Testing

Business Challenge

• Current MySQL environment requires pre-built aggregations. The ability to perform root cause analysis is limited.

• Computing aggregations takes too long, reducing machine utilization and causing more scrap product.

Use Case

• One Automated Testing Equipment handles 24 wafers per lot; 1 wafer generates 1 billion test results. Data volume required pre-built aggregations, which took too long to build.

ParStream's Technology Value Proposition

• Real-time monitoring of continuous data-flow for immediate insight/action to reduce waste and increase output

• Unlimited scalability allows Galaxy to market to bigger semiconductor testing and manufacturing companies

Benefits/Results

• Improved Machine Utilization: Current batch-style analysis of test data causes expensive test machines to be underutilized

• Revenue Increase: Increased data volume opens new, more lucrative markets and the ability to sell to larger customers

• New Products: Drill-down analysis to detailed test results leads to new insights

• Cost Savings: Ability to analyze detail-level data is expected to produce new insights into the causes of test failures
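The limitation named under Business Challenge can be shown with a small example: a pre-built aggregate answers only the question it was built for, while detail-level records also support drill-down. A minimal sketch using toy records; the field names, values, and function names are hypothetical.

```python
from collections import Counter

# Toy detail-level test results: (wafer_id, test_name, passed).
# All values are invented for illustration.
results = [
    (1, "leakage", True), (1, "timing", False), (1, "timing", False),
    (2, "leakage", True), (2, "timing", True),
    (3, "leakage", False), (3, "timing", False),
]


def failures_per_wafer(rows):
    """Pre-built aggregate: failure count per wafer, detail discarded."""
    return Counter(wafer for wafer, _, passed in rows if not passed)


def failures_by_test(rows, wafer_id):
    """Drill-down on detail rows: which tests failed on one wafer."""
    return Counter(
        test for wafer, test, passed in rows
        if wafer == wafer_id and not passed
    )


summary = failures_per_wafer(results)
drill_down = failures_by_test(results, 1)

print(dict(summary))      # counts only: no hint of which test failed
print(dict(drill_down))   # root-cause view for wafer 1
```

The summary shows that wafers 1 and 3 each had two failures, but only the detail rows reveal that both failures on wafer 1 came from the same test. At the volume stated on the slide (24 wafers per lot, 1 billion results per wafer, so 24 billion results per lot), keeping that detail queryable is the hard part.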

Page 13

Customer Proof Point for IoT in Oil & Gas: Interactive Analytics of Bore-Well condition

Undisclosed Customer

Business Challenge

• Oil quality has a major impact on profit margins

• Oil quality heavily depends on bore well condition

• Rapidly identifying and analyzing changes in bore well condition is crucial

Use Case

• Determine impurities and operational incidents in oil wells to mitigate profit loss in oil & gas business

ParStream's Technology Value Proposition

• Import and store sensor data from geographically distributed oil wells

• Provide interactive analytics of current condition vs historical patterns

Benefits/Results

• Production pilot

Page 14

March 2016

Thank You

Questions?

Joerg Bienert

Page 15

INRA Discovers Meta-Genomic Indicators Faster

INRA MetaGenoPolis (MGP) analyzes 17 billion records

INRA is the world leader in meta-genomic research

Up to 50 million different bacteria are identified per stool sample

Sample size will grow by 100x over next 12 months

Data volume will grow from 17 billion to 2 trillion records

Researchers analyze correlation of bacteria presence with illnesses

ParStream is used to interactively discover and analyze correlations
