How fast is fast enough? SAP HANA in-memory technologies ... · 10/8/2014  · DataWarehouse...

How fast is fast enough? SAP HANA in-memory technologies for Big Data Dmitry Shepelyavy, Platform Business Area Head, SAP CIS Oct 08, 2014

Page 1: How fast is fast enough? SAP HANA in-memory technologies ... · 10/8/2014  · DataWarehouse Sensors Mobile Archives Social & Text Geo-Spatial Location ... Hadoop (Hive) SDK for adding

How fast is fast enough?

SAP HANA in-memory technologies for Big Data

Dmitry Shepelyavy, Platform Business Area Head, SAP CIS

Oct 08, 2014

Page 2

How to turn new signals into business value?

Business value: Brand Sentiment | 360° Customer View | Product Recommendation | Propensity to Churn | Real-time Demand/Supply Forecast | Predictive Maintenance | Fraud Detection | Network Optimization | Insider Threats | Risk Mitigation, Real-time | Asset Tracking | Personalized Care

Signals: Customer Data | Automobiles | Machine Data | Smart Meter | Point of Sale | Mobile | Structured Data | Click Stream | Social Network | Location-based Data | Text Data | RFID

Sample social/text signals: ":-)" and "IMHO, it’s great!"

© 2014 SAP AG or an SAP affiliate company. All rights reserved. 2 Customer

Page 3

SAP HANA Platform for Big Data

REAL-TIME ANALYTICS: Operational Analytics | Big Data | Predictive, Spatial & Text Analytics

REAL-TIME APPLICATIONS: Sense & Respond | Planning & Optimization | Consumer Engagement

SAP Business Suite | StartUp & ISV Apps | HANA Apps, Accelerators & RDS | Any Apps | DWH & Datamarts on HANA

SAP HANA Platform
Administration | Development
Extended Application Services
Integration Services
Database Services
Processing Engine
Application Function Libraries & Data Models
Deployment: On-Premise | Hybrid | On-Demand

Page 4

Data Processing Simplified & Optimized with SAP HANA

• Fully ACID compliant, in-memory, columnar, massively parallel processing database platform
• Open interfaces: SQL, ODBC, JDBC, MDX, JSON, XML, …
• In-memory stored procedures and data virtualization with smart data access
• Integrated data processing for end-to-end analytic processing

Scan: 5 billion integers/sec/core | 12.5 million aggregates/sec/core
Ingest: 1.5 million records/sec/node

SAP HANA PLATFORM
Application Services | Integration Services | Deployment Services | Administration Services
Processing Engine: Event Processing | Planning | Calculation | Predictive | Text Mining | Rules | Search | Graph | Machine Learning | Time Series | Spatial GIS
Database Services: In-Memory | SIMD | OLTP + OLAP | MPP | CPU Cache Aware | Shared Nothing
Deployment: OnDemand | Hybrid | OnPremise
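To make the "columnar" bullet on this slide concrete, here is a minimal sketch in plain Python. It is not HANA code: the sample records, the helper names `to_columns` and `dictionary_encode`, and the choice of dictionary encoding as the compression scheme are all assumptions made for illustration.

```python
# Illustrative only: Python lists stand in for storage. The data, the helper
# names and the dictionary-encoding scheme are assumptions, not HANA internals.
from typing import Dict, List, Tuple

# Row layout: one tuple per record (id, country, amount).
rows: List[Tuple[int, str, int]] = [
    (1, "DE", 120),
    (2, "CH", 80),
    (3, "DE", 200),
    (4, "RU", 50),
]


def to_columns(records: List[Tuple[int, str, int]]) -> Dict[str, list]:
    """Pivot row tuples into one list per attribute (column layout)."""
    ids, countries, amounts = zip(*records)
    return {"id": list(ids), "country": list(countries), "amount": list(amounts)}


def dictionary_encode(values: List[str]) -> Tuple[List[str], List[int]]:
    """Replace repeated values with small integer codes into a sorted dictionary."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in values]


columns = to_columns(rows)

# An aggregate over one attribute reads only that attribute's list.
total_amount = sum(columns["amount"])

# Repeated strings shrink to integer codes plus one small dictionary.
country_dict, country_codes = dictionary_encode(columns["country"])
```

With the column layout, the sum touches one contiguous list instead of every field of every record, which is the usual argument for columnar storage in analytic workloads.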

Page 5

SAP HANA Software & Hardware Architecture

CPU: Multi-Core Architecture, 8 CPU x 15 Cores per node | L3 Cache (x 8) | Massive parallel scaling with many blades
MEMORY: 64bit address space, 6 TB in current servers | Dramatic decline in price/performance
STORAGE: SSD | HDD | Logging and Backup
Compression | Row + Columnar | OLTP+OLAP, no aggregate tables
DB + In database algorithms + Apps
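The throughput figures on the previous slide and the core count on this one can be combined into a back-of-the-envelope estimate. The sketch below assumes the scan figure reads as 5 billion integers per second per core and that scaling across cores and nodes is perfectly linear, which real systems will not reach. The function names are hypothetical.

```python
# Back-of-the-envelope arithmetic from the slide figures. Assumptions: the scan
# rate is read as 5 billion integers/sec/core, and scaling is perfectly linear.
SCAN_INTS_PER_SEC_PER_CORE = 5e9       # slide: scan rate per core (assumed reading)
INGEST_RECORDS_PER_SEC_PER_NODE = 1.5e6  # slide: ingest rate per node
CPUS_PER_NODE = 8                      # slide: 8 CPU per node
CORES_PER_CPU = 15                     # slide: 15 cores per CPU


def cores_per_node() -> int:
    """Total cores in one node."""
    return CPUS_PER_NODE * CORES_PER_CPU


def ideal_scan_seconds(num_values: float, nodes: int = 1) -> float:
    """Lower bound on the time to scan num_values integers on the given nodes."""
    rate = SCAN_INTS_PER_SEC_PER_CORE * cores_per_node() * nodes
    return num_values / rate


def ideal_ingest_seconds(num_records: float, nodes: int = 1) -> float:
    """Lower bound on the time to ingest num_records on the given nodes."""
    return num_records / (INGEST_RECORDS_PER_SEC_PER_NODE * nodes)


if __name__ == "__main__":
    print(cores_per_node())            # cores in one node
    print(ideal_scan_seconds(1e12))    # seconds to scan 10^12 integers on one node
    print(ideal_ingest_seconds(3e6))   # seconds to ingest 3 million records
```

These are best-case bounds only; they ignore memory bandwidth limits, skew, and coordination overhead.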

Page 6

In-Memory database
Combine OLTP, OLAP and HW acceleration

Today: complex, duplicate, inconsistent
- Several copies of data
- Different data models
- Inherent data latency

SAP HANA: easy-to-deploy, real-time, simplified experience
- Eliminate unnecessary complexity & latency
- Less hardware to manage
- Accelerate through simplification + in-memory
- Create new possibilities

Transact | Analyze | Accelerate
Transactions + analysis | In-memory acceleration

Page 7

In-Memory computing – More than a Database
Move data-intensive operations to the in-memory computing layer

Traditional applications execute many data-intensive operations in the application layer
High performance apps delegate data-intensive operations to the in-memory computing layer

In-Memory Computing Imperatives
- Avoid movement of detailed data
- Calculate first, then move results
- Eliminate unnecessary process steps
- Remove Latency
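The imperatives "avoid movement of detailed data" and "calculate first, then move results" can be sketched with any SQL database. The example below uses Python's built-in `sqlite3` module purely as a stand-in; it is not SAP HANA, and the `sales` table and function names are hypothetical.

```python
# Sketch of "calculate first, then move results". sqlite3 is only a stand-in
# database here; the table and function names are hypothetical.
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a small in-memory table of detail records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 10), ("south", 20), ("north", 30), ("south", 5)],
    )
    return conn


def totals_in_application(conn: sqlite3.Connection):
    """Traditional style: move every detail row, then aggregate in the app layer."""
    rows = conn.execute("SELECT region, amount FROM sales").fetchall()
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals, len(rows)  # result and number of rows moved


def totals_in_database(conn: sqlite3.Connection):
    """Delegated style: aggregate inside the database, move only the results."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    return dict(rows), len(rows)  # result and number of rows moved


if __name__ == "__main__":
    conn = make_db()
    print(totals_in_application(conn))
    print(totals_in_database(conn))
```

Both functions return the same totals; the difference is how many rows cross the boundary between the database and the application, which grows with the detail data in the first case and with the number of groups in the second.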

Page 8

SAP HANA - Simplifying Business Intelligence and Analytics

Page 9

The Big Data Challenge

ACQUIRE | PROCESS & STORE | ANALYZE | ACT

SAP: Big Data, Real-time, with Real Results

REAL RESULTS | REAL TIME

Page 10

Different new types of data technologies

Big Data: Key-Value | Document | Graph | NewSQL Databases | Cloud Solutions | Hadoop

Key-Value Store
Returns a chunk of data using a hash key (Key | Value)

Graph Store
Relationships (between nodes) are first class citizens

Document Store
Store hierarchical documents rather than rows

  {
    "firstName": "John",
    "lastName": "Smith",
    "age": 25,
    "address": {
      "streetAddress": "21 2nd Street",
      "city": "New York",
      "state": "NY",
      "postalCode": "10021"
    },
    "phoneNumber": [
      { "type": "home", "number": "212 555-1234" },
      { "type": "fax", "number": "646 555-4567" }
    ]
  }

NewSQL Databases
VoltDB: high performance by skipping recovery, latching, locking and buffer pools
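The three store types on this slide differ mainly in what a lookup returns. The sketch below mimics each access pattern with plain Python structures; it does not use any of the products named in this deck. The keys, the sample edges, and the helpers `get_path` and `neighbours` are hypothetical.

```python
# Plain-Python imitation of the three access patterns. No real datastore is
# used; keys, edges and helper names are hypothetical.
import json

# Key-value store: a hash key returns an opaque chunk of data.
kv_store = {"user:42": b"opaque chunk of bytes"}

# Document store: a hierarchical document is kept whole (shortened slide example).
document = json.loads("""
{
  "firstName": "John",
  "lastName": "Smith",
  "address": {"city": "New York", "state": "NY"},
  "phoneNumber": [
    {"type": "home", "number": "212 555-1234"},
    {"type": "fax", "number": "646 555-4567"}
  ]
}
""")
documents = {"john.smith": document}


def get_path(doc, *keys):
    """Follow a sequence of dict keys and list indexes into a nested document."""
    current = doc
    for key in keys:
        current = current[key]
    return current


# Graph store: relationships are stored explicitly as (source, label, target).
edges = [
    ("john", "KNOWS", "mary"),
    ("mary", "KNOWS", "peter"),
    ("john", "WORKS_AT", "acme"),
]


def neighbours(node: str, label: str) -> list:
    """Return the targets reachable from node over edges with the given label."""
    return [target for source, edge_label, target in edges
            if source == node and edge_label == label]


if __name__ == "__main__":
    print(kv_store["user:42"])
    print(get_path(documents["john.smith"], "address", "city"))
    print(neighbours("john", "KNOWS"))
```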

Page 11

Different new types of data technologies

Big Data:
- Key-Value, e.g. Cassandra, HBase, SimpleDB, Voldemort
- Document Stores, e.g. Couchbase, CouchDB, MongoDB
- Graph, e.g. Neo4j, Giraph, GraphBase, GraphLab, Infinite Graph
- NewSQL Databases, e.g. VoltDB, Starcounter
- Cloud Solutions, e.g. Amazon SimpleDB, DynamoDB, Redshift
- Hadoop: HDFS, Hive, HBase, Pig, Mahout


Page 12

Complexity of IT landscape
Point optimization is not enough to meet the new frontiers of real-time business

IMPACT ON BUSINESS: Slow Response Times | Usability Challenges | Lack Of Adaptability
IMPACT ON IT: High Latency | Complexity | High Cost of Solutions

Data stores and sources: Transactional Datastore | Data Warehouse | Sensors Data | Mobile Data | Archives | Social & Text | Geo-Spatial Location Intelligence

Real-time Business Scenario: Order Processing | Operational Reporting | Real-time Risk & Fraud | Trend Analysis | Sentiment Analytics | Predictive Analytics | Pattern Recognition

Processing steps: Analyze | ETL | Staging | Collect | Clean-Data Quality | Transact | Aggregate | Summarize | Communicate | Monitor | Predict | Planning

Use cases: Product Recommendation | Predictive Maintenance | Fraud Detection | Network Optimization | Insider Threats

Page 13

SAP HANA Platform – More than just a database

SAP HANA platform converges Database, Data Processing and Application Platform capabilities & provides libraries for predictive, planning, text, spatial, and business analytics so businesses can operate in real-time.

Any Apps: Any App Server
SAP Business Suite and BW: ABAP App Server
Open Connectivity: JSON | R | MDX | SQL
Supports any Device

SAP HANA Platform
Extended Application Services: App Server | UI Integration Services | Web Server
Processing Engine: OLTP | OLAP | Search | Text Analysis | Predictive | Events | Spatial | Rules | Planning | Graph
Application Function Libraries & Data Models: Predictive Analysis Libraries | Business Function Libraries | Data Models & Stored Procedures
Integration Services: Data Virtualization | Replication | ETL/ELT | Mobile Synch | Streaming
Database Services
Deployment: On-Premise | Hybrid | On-Demand
Unified Administration | Life-cycle Management | Security | Application Development | Process Orchestration

Page 14

SAP Event Stream Processor

Event Stream Processor (ESP): INPUT STREAMS/EVENTS | OUTPUT STREAMS/EVENTS | Studio (Authoring)

Connected elements: Event Data | Sensors | Alerts | SAP HANA | Dashboard | Message Bus | Analytics | Applications | Business Data

- Integrate events & history
- Extreme performance & scalability
- Output to applications, dashboards, devices, messaging platforms
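The idea of turning an input stream into an output stream of alerts, using recent history as the baseline, can be sketched in a few lines. This is a generic sliding-window check in plain Python, not the Event Stream Processor's own language or API. The function name `detect_spikes`, the window size, and the spike factor are assumptions.

```python
# Generic sliding-window sketch in plain Python. This is not ESP code; the
# function name, window size and spike factor are illustrative assumptions.
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

Event = Tuple[int, float]          # (timestamp, sensor value)
Alert = Tuple[int, float, float]   # (timestamp, value, baseline)


def detect_spikes(events: Iterable[Event], window: int = 3,
                  factor: float = 1.5) -> Iterator[Alert]:
    """Yield an alert when a value exceeds factor times the recent average.

    The baseline is the mean of the previous `window` values, so each new
    event is compared against history as it arrives.
    """
    history: Deque[float] = deque(maxlen=window)
    for timestamp, value in events:
        if len(history) == window:
            baseline = sum(history) / window
            if value > factor * baseline:
                yield (timestamp, value, baseline)
        history.append(value)


if __name__ == "__main__":
    input_stream = [(1, 10.0), (2, 10.0), (3, 10.0), (4, 30.0), (5, 11.0)]
    for alert in detect_spikes(input_stream):
        print(alert)
```

Because `detect_spikes` is a generator, alerts are produced as events arrive and only the last `window` values are held in memory.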

Page 15

SAP HANA - Spatial Engine

SAP HANA Spatial Processing

Real-time Spatial Processing
High-performance algorithms analyze massive amounts of spatial data in real-time

Spatial Analytics Optimization
Columnar storage architecture eliminates the need to create spatial indexes, tessellation, or other optimization techniques.

Geo-content & services
Maps, geo-content and geospatial services, with open integration for seamless application development and deployment

Spatial Data Types & Functions
Store, process, manipulate, share and retrieve spatial data directly in the database

Mobility | Visualization | Analytics | HTML 5 | GIS Applications

Business Data + Spatial Data + Real-time Data

Geo – Services
- Geocoding
- Base maps

Geo – Content
- Political Boundaries
- POIs
- Roads

Columnar Spatial Processing

Calc Model / Views
- Joins
- Views

Spatial Functions
- Area
- Distance
- Within

Spatial Data Types
- Points
- Lines
- Polygons

Transaction Data | Unstructured Data | Location Data | Machine Data
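The three spatial functions listed on this slide (Area, Distance, Within) have simple textbook definitions for planar coordinates. The sketch below implements them in plain Python for points and simple polygons. It is not the HANA spatial API, it ignores map projections and the curvature of the earth, and points exactly on a polygon boundary are not treated specially.

```python
# Textbook planar geometry in plain Python. This is not the HANA spatial API;
# coordinates are assumed to be flat (x, y) pairs in the same unit.
import math
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # vertices in order, without repeating the first one


def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def area(polygon: Polygon) -> float:
    """Area of a simple polygon using the shoelace formula."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def within(point: Point, polygon: Polygon) -> bool:
    """Ray-casting test: is the point strictly inside the polygon?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


if __name__ == "__main__":
    rectangle = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
    print(area(rectangle))
    print(distance((0.0, 0.0), (3.0, 4.0)))
    print(within((1.0, 1.0), rectangle), within((5.0, 1.0), rectangle))
```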

Page 16

SAP HANA - Text Engine

Page 17

Predictive Analytics SAP HANA

Accelerate predictive analysis and scoring with in-database algorithms delivered out-of-the-box. Adapt the models frequently.

Execute R commands as part of the overall query plan by transferring intermediate DB tables directly to R as vector-oriented data structures.

Predictive analytics across multiple data types and sources (e.g. Unstructured Text, Geospatial, Hadoop).

PAL: C4.5 decision tree | Weighted score tables | Regression | ABC classification | KNN classification | K-means | Association analysis: market basket

SAP HANA: SQL Script | Optimized Query Plan | Main Memory | PAL | R-scripts | Text Analysis | Unstructured | Spatial Data
R-Engine
HANA Studio/AFM, Apps & Tools
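Of the algorithms named on this slide, ABC classification is the simplest to show end to end. The sketch below is a generic version in plain Python, not the PAL implementation. The 70% / 20% / 10% split, the rule of classifying by cumulative share before adding each item, and the function name `abc_classify` are all assumptions.

```python
# Generic ABC classification in plain Python. This is not the PAL procedure;
# the 70/20/10 split and the tie-breaking rule are illustrative assumptions.
from typing import Dict


def abc_classify(values: Dict[str, float], a_share: float = 0.7,
                 b_share: float = 0.2) -> Dict[str, str]:
    """Label items A, B or C by their position in the cumulative value ranking.

    Items are sorted by value, largest first. An item's class depends on the
    cumulative share of the total reached before that item is added.
    """
    total = sum(values.values())
    if total <= 0:
        raise ValueError("total value must be positive")

    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    labels: Dict[str, str] = {}
    cumulative = 0.0
    for name, value in ranked:
        share_before = cumulative / total
        if share_before < a_share:
            labels[name] = "A"
        elif share_before < a_share + b_share:
            labels[name] = "B"
        else:
            labels[name] = "C"
        cumulative += value
    return labels


if __name__ == "__main__":
    revenue = {"p1": 60, "p2": 25, "p3": 10, "p4": 5}
    print(abc_classify(revenue))
```

Running the same logic inside the database, as the slide describes, avoids moving the per-item values to the application first.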

Page 18

SAP HANA Smart Data Access

- Leverage remote compute engines
- Single development environment
- Heterogeneous data sources
- Hadoop (Hive)
- SDK for adding support for additional data sources
- Query monitoring and statistics
- Performance and query optimization

SAP HANA: Transactions + Analytics
Remote sources: Teradata | Hadoop, Hive | SAP IQ | Oracle, SQL Server | SDK for Custom Adapters
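The core idea of querying a remote source through the same SQL interface as local data can be imitated on a small scale. The sketch below uses SQLite's `ATTACH DATABASE` through Python's built-in `sqlite3` module as a stand-in for a remote source; it is an analogy, not Smart Data Access itself, and the table and function names are hypothetical.

```python
# Small-scale analogy for one SQL statement spanning two data sources.
# sqlite3 with ATTACH DATABASE is only a stand-in; names are hypothetical.
import sqlite3


def build_federated_connection() -> sqlite3.Connection:
    """Create a local database and attach a second one under the name 'remote'."""
    conn = sqlite3.connect(":memory:")                    # the local database
    conn.execute("ATTACH DATABASE ':memory:' AS remote")  # stand-in remote source
    conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE remote.clicks (customer_id INTEGER, page TEXT)")
    conn.executemany(
        "INSERT INTO main.customers VALUES (?, ?)",
        [(1, "Anna"), (2, "Boris")],
    )
    conn.executemany(
        "INSERT INTO remote.clicks VALUES (?, ?)",
        [(1, "/home"), (1, "/cart"), (2, "/home")],
    )
    return conn


def clicks_per_customer(conn: sqlite3.Connection) -> list:
    """Join the local table with the attached one in a single query."""
    return conn.execute(
        """
        SELECT c.name, COUNT(*)
        FROM main.customers AS c
        JOIN remote.clicks AS k ON k.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


if __name__ == "__main__":
    print(clicks_per_customer(build_federated_connection()))
```

The application sees one connection and one query; where each table physically lives is hidden behind the schema name.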

Page 19

“If I had asked people what they wanted, they would have said faster horses.”
Henry Ford