Enterprise Grade Streaming under 2ms on Hadoop

Enterprise Grade Streaming Under 2ms On Hadoop @vijaysbhat

Transcript of Enterprise Grade Streaming under 2ms on Hadoop

Page 1: Enterprise Grade Streaming under 2ms on Hadoop

Enterprise Grade Streaming Under 2ms On Hadoop

@vijaysbhat

Page 3: Enterprise Grade Streaming under 2ms on Hadoop

VS.

Page 7: Enterprise Grade Streaming under 2ms on Hadoop

X (predictor): Spend amount, geo

Y (response)

Simple Velocity Advanced

Page 11: Enterprise Grade Streaming under 2ms on Hadoop

Hard Metrics (Goal)

Latency: < 40 ms; ideally < 16 ms

Throughput: 2,000 events / second

Durability: No loss; every message gets exactly one response

Availability: 99.5% uptime (downtime of 1.83 days / year); ideally 99.999% uptime (downtime of 5.26 minutes / year)

Scalability: Can add resources and still meet latency requirements

Integration: Transparently connected to existing systems (hardware, messaging, HDFS)

Soft Metrics (Goal)

Open Source: All components licensed as open source

Extensibility: Rules can be updated; model is regularly refreshed
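
The availability goals translate into yearly downtime by simple arithmetic. A minimal Python sketch of that conversion follows; the function name `downtime_minutes_per_year` is made up for illustration, and a 365-day year is assumed. It should reproduce the two downtime figures quoted on the slide.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap, 365-day year


def downtime_minutes_per_year(uptime_percent):
    """Allowed downtime in minutes per year for a given uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (1 - uptime_percent / 100) * MINUTES_PER_YEAR


# 99.5% uptime, expressed in days of downtime per year
print(f"{downtime_minutes_per_year(99.5) / (24 * 60):.3f} days / year")

# 99.999% uptime, expressed in minutes of downtime per year
print(f"{downtime_minutes_per_year(99.999):.3f} minutes / year")
```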

Page 13: Enterprise Grade Streaming under 2ms on Hadoop

Onyx

Page 14: Enterprise Grade Streaming under 2ms on Hadoop

Enterprise Readiness

Roadmap

Performance

Community

Page 21: Enterprise Grade Streaming under 2ms on Hadoop

YARN

Page 24: Enterprise Grade Streaming under 2ms on Hadoop

Failure Handling

Page 26: Enterprise Grade Streaming under 2ms on Hadoop

Performance

• Avg. 0.25 ms at 70k records/sec, with 600 GB RAM

Thread Local on ~54M events; percentiles in ms

Throughput  Count       Avg (ms)  90%  95%  99%  99.9%  4 9’s  5 9’s  6 9’s
70k/sec     54,126,122  0.19      1    1    1    2      2      5      6
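
A percentile table like this one can be derived from raw per-event latencies. The sketch below is not the author's tooling. It assumes the nearest-rank percentile definition (the slides do not say which method was used), and the helper names `nearest_rank_percentile` and `latency_summary` and the sample data are hypothetical.

```python
import math


def nearest_rank_percentile(sorted_samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not sorted_samples:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct * len(sorted_samples) / 100))
    return sorted_samples[rank - 1]


def latency_summary(latencies_ms,
                    percentiles=(90, 95, 99, 99.9, 99.99, 99.999, 99.9999)):
    """Return the count, average, and requested percentiles of latencies (ms)."""
    ordered = sorted(latencies_ms)
    summary = {"count": len(ordered), "avg": sum(ordered) / len(ordered)}
    for pct in percentiles:
        summary[pct] = nearest_rank_percentile(ordered, pct)
    return summary


# Illustrative data only, not measurements from the talk.
samples = [1] * 98 + [2, 5]
for key, value in latency_summary(samples).items():
    print(key, value)
```

With tens of millions of events, sorting every sample may be too costly; a histogram with fixed buckets is a common alternative, at the price of bucket-level precision.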

Page 27: Enterprise Grade Streaming under 2ms on Hadoop

Durability

• Two physically independent pipelines on the same cluster processing identical data

• For the same tuple, we take the best-case time between the two pipelines

  – 39 records out of 5.2M exceeded 16 ms even in the best case

  – 173 out of 5.2M exceeded 16 ms in one pipeline but succeeded in the other

• 99.99925% success rate (“Five Nines”)

• Average latency of 0.0981 ms
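
The best-case comparison can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: the function name `best_of_two`, the dict inputs keyed by tuple ID, and the sample values are hypothetical. Only the 16 ms threshold and the 39-in-5.2M figure come from the slide, and 5.2M is itself a rounded count.

```python
SLA_MS = 16.0  # latency threshold quoted on the slide


def best_of_two(latencies_a, latencies_b, sla_ms=SLA_MS):
    """Compare two pipelines that processed the same tuples.

    Each argument maps tuple_id -> latency in ms. A tuple missing from one
    pipeline is treated as having infinite latency there (an assumption).
    Returns (total, misses, success_rate), where a miss is a tuple whose
    best latency across both pipelines still exceeds sla_ms.
    """
    ids = set(latencies_a) | set(latencies_b)
    if not ids:
        raise ValueError("no tuples to compare")
    inf = float("inf")
    misses = 0
    for tuple_id in ids:
        best = min(latencies_a.get(tuple_id, inf), latencies_b.get(tuple_id, inf))
        if best > sla_ms:
            misses += 1
    total = len(ids)
    return total, misses, (total - misses) / total


# Hypothetical sample: tuple 2 is slow in pipeline A only, tuple 3 in both.
pipeline_a = {1: 0.1, 2: 20.0, 3: 30.0}
pipeline_b = {1: 0.2, 2: 0.3, 3: 17.0}
print(best_of_two(pipeline_a, pipeline_b))

# Success rate implied by the slide's figures (39 misses out of ~5.2M tuples).
slide_success_rate = 1 - 39 / 5_200_000
print(f"{slide_success_rate * 100:.5f}%")
```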

Page 28: Enterprise Grade Streaming under 2ms on Hadoop

@vijaysbhat

linkedin.com/in/vijaysbhat