Fast Data: Selecting The Right Streaming Technologies For Data Sets That Never End

WEBINAR

Dr. Dean Wampler (@deanwampler)

Upgrade your grey matter

Get Dean’s free O’Reilly book from Lightbend

bit.ly/fastdata-ORbook

Streaming Engines in Context…

Classic Batch Architecture: Hadoop

Diagram: Hadoop batch architecture. Ingestion: Flume (logs), Sqoop (DBs). Cluster management: YARN (ResourceManager, NodeManagers). Batch engines: MapReduce, Spark. Persistence: HDFS, S3, disks, SQL/NoSQL, search.

New Streaming, “Fast Data” Architecture (but it also supports batch)

Diagram: Fast Data architecture, with numbered callouts 1 to 11. Cluster management: Mesos, YARN, Cloud, … Sources: logs, sockets, REST. Coordination: ZooKeeper cluster (ZK). Messaging: Kafka cluster. Microservices: RP, Go, Node.js, … Mini-batch: Spark Streaming. Batch: Spark. Low latency: Flink, Kafka Streams, Akka Streams, Gearpump. Beam. Persistence: S3, HDFS, disks, SQL/NoSQL, search.

Streaming Engines

Features to Consider

• Low latency? How low?
• High volume? How high?
• Integration with other tools? Which ones?
• Kinds of data processing, analytics? Which ones?
    • Bulk processing of records?
    • Individual processing of events?
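The last question, bulk processing of records versus individual processing of events, can be illustrated with a minimal Python sketch. It is not taken from the webinar and uses no streaming engine. The names `endless_source`, `per_event`, and `mini_batches` are hypothetical, and batching by count is a simplification: mini-batch engines such as Spark Streaming group records by time interval instead.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def endless_source() -> Iterator[int]:
    """Hypothetical stand-in for a data set that never ends: 1, 2, 3, ..."""
    n = 0
    while True:
        n += 1
        yield n


def per_event(events: Iterable[int], handle: Callable[[int], int]) -> Iterator[int]:
    """Individual processing: handle each event as soon as it arrives."""
    for event in events:
        yield handle(event)


def mini_batches(events: Iterable[int], batch_size: int) -> Iterator[List[int]]:
    """Bulk processing: buffer events and emit them in fixed-size groups."""
    batch: List[int] = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller, batch of a finite input
        yield batch


if __name__ == "__main__":
    # Per-event: a result is available after every single event.
    print(list(islice(per_event(endless_source(), lambda e: e * 2), 4)))
    # Mini-batch: a result is available only once a whole batch has filled.
    print([sum(b) for b in islice(mini_batches(endless_source(), 3), 2)])
```

The trade-off the sketch hints at: per-event handling keeps latency low because nothing waits in a buffer, while batching delays each record until its batch is complete but lets the work be done in bulk.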

Example Streaming Engines

Diagram: the Fast Data architecture diagram repeated. Streaming engines shown: Spark Streaming (mini-batch); Flink, Kafka Streams, Akka Streams, and Gearpump (low latency); Beam.

Upgrade your grey matter

Get Dean’s free O’Reilly book from Lightbend

http://bit.ly/fastdata-ORbook

For more information on Lightbend Fast Data Platform:

lightbend.com/fast-data-platform