Massive Streaming Analytics with Spark Streaming

Mattia Bertorello

Who is Agile Lab?

GO BIG (data) or GO HOME

Summary

• Why streaming matters

• Why prediction?

• Streaming architecture

• Spark Streaming

• Demo time

Why streaming matters

Ⓒ2015 Agile Lab S.r.l.

Why streaming matters

Data → Big Data → Batch Analysis → Business Reaction

Typical Big Data Workflow

Why streaming matters

Data → Real-Time Processing → Business Reaction

FASTER REACTIONS, MORE PROFITS

Streaming Big Data Workflow

Why streaming matters

• Fleet Management

• Insurance

• Recommendation

• Etc...

Why prediction?

Why prediction?

• Rule-based categorization and clustering are obsolete

• Pattern discovery

• Adaptation to fast-changing data

• Smart thinking: no dummies

• Prediction is more valuable

Streaming architecture

Streaming architecture

Ingestion Layer

Processing Layer

Serving Layer
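
The three layers can be sketched roughly as follows. This is a minimal illustration in plain Python rather than Spark, so that it runs anywhere. The function names (`ingest`, `process`, `serve`, `run_pipeline`), the `(key, value)` event shape, the fixed-size batching, and the dictionary standing in for a serving store are all assumptions made for the example; the slides do not name the components used in each layer.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Event = Tuple[str, float]  # (key, value); hypothetical event shape


def ingest(source: Iterable[Event], batch_size: int) -> Iterator[List[Event]]:
    """Ingestion layer: group incoming events into small batches."""
    batch: List[Event] = []
    for event in source:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly partial, batch
        yield batch


def process(batch: List[Event]) -> Dict[str, float]:
    """Processing layer: sum the values per key within one batch."""
    totals: Dict[str, float] = defaultdict(float)
    for key, value in batch:
        totals[key] += value
    return dict(totals)


def serve(store: Dict[str, float], update: Dict[str, float]) -> None:
    """Serving layer: merge one batch result into a queryable store."""
    for key, value in update.items():
        store[key] = store.get(key, 0.0) + value


def run_pipeline(source: Iterable[Event], batch_size: int = 2) -> Dict[str, float]:
    """Wire the three layers together and return the final store."""
    store: Dict[str, float] = {}
    for batch in ingest(source, batch_size):
        serve(store, process(batch))
    return store
```

Keeping each layer behind its own function means the ingestion source or the serving store could be swapped without touching the processing logic, which is the main point of the layered design.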

Spark Streaming

Spark Notebook

Use Apache Spark straight from the Browser

Demo time...

Card transaction analysis

ENCRYPTED PAN | AMOUNT | DESCRIPTION | TIMESTAMP

Transaction classification

online/offline

ENCRYPTED PAN | AMOUNT | DESCRIPTION | TIMESTAMP | ISONLINE
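
The classification step above, which takes a four-field record and adds an ISONLINE flag, might look something like the sketch below. The slides do not say how online transactions are recognized, so the keyword rule in `ONLINE_KEYWORDS` is purely hypothetical, as are the `Transaction` class and the `parse_transaction` and `classify` helpers. The demo itself ran on Spark; this is plain Python for illustration only.

```python
from dataclasses import dataclass

# Hypothetical rule: the slides do not state how online purchases are detected.
ONLINE_KEYWORDS = ("WWW", ".COM", "ONLINE")


@dataclass
class Transaction:
    pan: str            # encrypted card number
    amount: float
    description: str
    timestamp: str
    is_online: bool = False


def parse_transaction(line: str) -> Transaction:
    """Parse one pipe-delimited record with four fields."""
    pan, amount, description, timestamp = [f.strip() for f in line.split("|")]
    return Transaction(pan, float(amount), description, timestamp)


def classify(tx: Transaction) -> Transaction:
    """Set is_online when the description matches a hypothetical keyword."""
    text = tx.description.upper()
    tx.is_online = any(keyword in text for keyword in ONLINE_KEYWORDS)
    return tx
```

A record such as `ab12 | 19.99 | www.example.com order | 2015-06-01T10:00:00` would be flagged as online under this rule, while one described as `Grocery store` would not.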

fraud detection algorithm

SQL aggregation

Real-time alert generation
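
To make the last two steps concrete, here is a small sketch of a SQL aggregation over one batch of classified transactions that emits an alert row per suspicious card. It uses Python's built-in `sqlite3` module as a stand-in for Spark SQL. The detection rule (total online spend per card above a threshold) is a deliberately simple placeholder and not the fraud detection algorithm from the demo, which the slides do not describe. The `find_alerts` function, the row shape, and the `max_total` parameter are all assumptions.

```python
import sqlite3
from typing import List, Tuple

# One batch of rows: (pan, amount, is_online). The shape is an assumption.
Row = Tuple[str, float, int]


def find_alerts(batch: List[Row], max_total: float) -> List[Tuple[str, float, int]]:
    """Return (pan, total, count) for cards whose online spend exceeds max_total."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE tx (pan TEXT, amount REAL, is_online INTEGER)")
        conn.executemany("INSERT INTO tx VALUES (?, ?, ?)", batch)
        cursor = conn.execute(
            """
            SELECT pan, SUM(amount) AS total, COUNT(*) AS n
            FROM tx
            WHERE is_online = 1          -- placeholder rule: online spend only
            GROUP BY pan
            HAVING SUM(amount) > ?       -- placeholder threshold
            ORDER BY pan
            """,
            (max_total,),
        )
        return cursor.fetchall()
    finally:
        conn.close()
```

In a streaming setting a function like this would be called once per batch or window, and each returned row would be forwarded as an alert.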

We are hiring...

[email protected]