
Posted on 25-Apr-2020


ELF: Efficient Lightweight Fast Stream Processing at Scale*

Liting Hu, Karsten Schwan, Hrishikesh Amur, Xin Chen

Georgia Institute of Technology

*Funded in part with support from the Intel Cloud ISTC

Outline

Motivation

ELF Design

Evaluation

Applications

(Figure: applications #TopK, micro-promotions, product-bundling, sales-prediction, ... producing coupons, recommendations, and predictions.)

(Figure: example job dataflows. Labels: In, Filter @item, Sort #clicks, Out; Filter @buys, Out; Query "Angry Birds5?" ⋈ Yes/No?)

Concurrent, diverse jobs with varied delays

(Figure: update intervals of every half hour, per minute, and every x milliseconds; job types are batch model, stream model, and user query.)

System Needs

Performance
•  High throughput
•  Low latency

Flexible
•  Diverse jobs
•  Cross-job coordination
•  Runtime job change

Scalable
•  Hundreds of concurrent jobs with shared inputs

(Figure: batch job, stream job, and query job on ELF.)

Existing Solutions

(Figure: web server; log collection systems, e.g., Flume, Kafka; storage; N jobs.)


ELF runs directly on the web server tier

(Figure: three applications, each with its own Master; web servers Join each one.)

hash(‘app1’) = d502fb

hash(‘app2’) = d462ba

hash(‘app3’) = 2080c3

Each organized into a peer-to-peer overlay

Many Masters/Many Workers
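The slides show each application name being hashed to an ID, with a separate master per application. A minimal sketch of how such a mapping could work follows. It assumes SHA-1 truncated to six hex digits and a "numerically closest node on a ring" rule; the slides do not state the hash function or the placement rule, so the IDs produced here will not match the ones on the slide, and `node_id`, `ring_distance`, `master_for`, and the `webNN` server names are hypothetical.

```python
import hashlib

def node_id(name: str, digits: int = 6) -> int:
    """Map a name to a point in a circular ID space of `digits` hex digits.
    SHA-1 is an assumption; the slides only show 6-digit hex IDs."""
    return int(hashlib.sha1(name.encode("utf-8")).hexdigest()[:digits], 16)

def ring_distance(a: int, b: int, digits: int = 6) -> int:
    """Shortest distance between two IDs on the circular ID space."""
    size = 16 ** digits
    d = abs(a - b)
    return min(d, size - d)

def master_for(job: str, servers: list[str], digits: int = 6) -> str:
    """Pick the server whose ID is closest to the job's ID (assumed rule).
    Ties are broken by server name so the choice is deterministic."""
    key = node_id(job, digits)
    return min(servers,
               key=lambda s: (ring_distance(node_id(s, digits), key, digits), s))

# Hypothetical web server tier of 16 nodes.
servers = [f"web{i:02d}" for i in range(16)]
masters = {job: master_for(job, servers) for job in ("app1", "app2", "app3")}
```

Because the master is derived from the job name alone, any web server can compute it locally, and different jobs tend to land on different masters without a central coordinator.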

ELF Software Architecture

ELF Design

•  CBT-based agent ‘pre-reduces’ local key-value pairs

•  SRT globally ‘shuffles’ and ‘reduces’ key-value pairs

•  ELF APIs for programmers to implement diverse jobs

Sample ELF Query

Compressed Buffer Tree (CBT)

An in-memory data structure:

(1) ‘insert’ to fill, e.g., <(a,1)(b,1)(a,1)(b,1)(b,1)>

(2) ‘flush’ to fetch the pre-reduced result, e.g., <(a,2)(b,3)>

(3) ‘empty’ to empty the states
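The three operations above can be sketched with a plain dictionary. This is only a stand-in for the interface: a real CBT buffers and compresses its contents, which this sketch does not attempt, and the class name `PreReduceBuffer` and its method signatures are hypothetical.

```python
class PreReduceBuffer:
    """Dictionary-backed stand-in for the CBT's insert/flush/empty interface.
    No buffering or compression is modeled here."""

    def __init__(self, reduce_fn=lambda x, y: x + y):
        self._reduce = reduce_fn   # how two values for the same key combine
        self._state = {}           # key -> pre-reduced value

    def insert(self, key, value):
        """(1) Fill: fold the new value into the existing state for `key`."""
        if key in self._state:
            self._state[key] = self._reduce(self._state[key], value)
        else:
            self._state[key] = value

    def flush(self):
        """(2) Fetch the pre-reduced result without discarding the state."""
        return sorted(self._state.items())

    def empty(self):
        """(3) Drop all accumulated state."""
        self._state.clear()

# The example from the slide: five pairs pre-reduce to two.
buf = PreReduceBuffer()
for k, v in [("a", 1), ("b", 1), ("a", 1), ("b", 1), ("b", 1)]:
    buf.insert(k, v)
result = buf.flush()        # [("a", 2), ("b", 3)]
buf.empty()
after_empty = buf.flush()   # []
```

Keeping `flush` and `empty` separate lets a job read intermediate results while continuing to accumulate state, and reset only when it chooses to.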

ELF Design

•  CBT-based agent ‘pre-reduces’ local key-value pairs
   Stateful, asynchronous, and synchronous execution

•  SRT globally ‘shuffles’ and ‘reduces’ key-value pairs

•  ELF APIs for programmers to implement diverse jobs


Per-job dataflow

hash(‘micro-promotion’) = d462ba

Inputs: (a,2) (b,3) | (a,5) | (b,2) (c,2) | (c,1) (d,7)

Partial results: (a,7) (b,3) | (d,7) (c,3) (b,2)

Result: (a,7) (d,7) (b,5) (c,3)

hash(‘product-bundling’) = 2080c3

Inputs: (a,<b>) (b,<a>) | (a,<c>) | (c,<f>) (d,<e>) | (c,1) (d,7)

Partial results: (a,<b,c>) (d,<e,g,h>) | (b,<a>) (c,<f>)

Result: (a,<b,c>) (b,<a>) (c,<f>) (d,<e,g,h>)
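The micro-promotion example can be reproduced with a small tree-style reduction. This is a sketch of the idea only: it merges pre-reduced partials pairwise as a two-level tree, and the function names `merge_counts` and `merge_sets` are hypothetical rather than part of ELF's API.

```python
from collections import Counter

def merge_counts(*partials):
    """Combine pre-reduced (key, count) partials by summing counts per key."""
    total = Counter()
    for p in partials:
        total.update(p)   # Counter.update adds counts rather than replacing
    return total

def merge_sets(*partials):
    """Combine (key, set) partials by taking the union per key."""
    out = {}
    for p in partials:
        for key, values in p.items():
            out.setdefault(key, set()).update(values)
    return out

# Leaf inputs from the micro-promotion slide.
leaves = [{"a": 2, "b": 3}, {"a": 5}, {"b": 2, "c": 2}, {"c": 1, "d": 7}]
left = merge_counts(leaves[0], leaves[1])    # a:7, b:3
right = merge_counts(leaves[2], leaves[3])   # b:2, c:3, d:7
root = merge_counts(left, right)

# Rank by count (descending), breaking ties by key.
ranked = sorted(root.items(), key=lambda kv: (-kv[1], kv[0]))
# [("a", 7), ("d", 7), ("b", 5), ("c", 3)]

# Set-valued merge, using two of the product-bundling inputs.
bundles = merge_sets({"a": {"b"}, "b": {"a"}}, {"a": {"c"}})
```

Because both merges are associative and commutative, partial results can be combined at any interior node of the tree in any order and still give the same root value.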

Per-job dataflow

(Figure: the trees for hash(‘micro-promotion’) = d462ba and hash(‘product-bundling’) = 2080c3, reaching each webserver, annotated with:)

•  Synchronizing CBT to be emptied

•  Last iteration’s result

•  New job function or parameter

Cross-job Coordination

hash(‘micro-promotion’) = d462ba

hash(‘product-bundling’) = 2080c3

hash(‘sale-prediction’) = 43bc3c

Limited to O(log(N)) hops
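To illustrate why routing between job masters in a structured overlay can stay within O(log(N)) hops, here is a sketch using a Chord-style ring with power-of-two shortcuts. This is a stand-in chosen for simplicity: the slides do not say which overlay routing scheme ELF uses, the ring is assumed to be fully populated, and `route` is a hypothetical helper.

```python
def route(src: int, dst: int, bits: int) -> list[int]:
    """Greedy routing on a fully populated ring of 2**bits nodes, where every
    node can jump ahead by any power of two (assumed Chord-style fingers).
    Returns the list of nodes visited, including src and dst."""
    size = 1 << bits
    path = [src]
    cur = src
    while cur != dst:
        remaining = (dst - cur) % size
        # Take the largest power-of-two jump that does not overshoot.
        step = 1 << (remaining.bit_length() - 1)
        cur = (cur + step) % size
        path.append(cur)
    return path

bits = 10                      # N = 1024 nodes
n_nodes = 1 << bits
# Worst-case hop count from node 0 to any destination.
worst = max(len(route(0, d, bits)) - 1 for d in range(n_nodes))
```

Each hop clears the highest set bit of the remaining distance, so the hop count is at most `bits`, that is, log2(N). In this model a message from one job's master to another's needs at most 10 hops among 1024 nodes.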

ELF Design

•  CBT-based agent ‘pre-reduces’ local key-value pairs
   Stateful, asynchronous, and synchronous execution

•  SRT globally ‘shuffles’ and ‘reduces’ key-value pairs
   Per-job dataflow, and cross-job coordination

•  ELF APIs for programmers to implement diverse jobs


Using ELF


Latency

•  Replay Twitter’s stream

•  1280 agents, 50 events/second each

•  60 × 12-core 2.66 GHz AMD Opteron servers, 48 GB RAM per server, Gigabit Ethernet

✓  ELF outperforms for large windows due to the efficient CBT flush

New Functionality

•  Tens of milliseconds for multicast messages

•  Hundreds of milliseconds to update functions

Scalability

•  60 × 12-core 2.66 GHz AMD Opteron servers, 48 GB RAM per server, Gigabit Ethernet

•  1000 agents

✓  ELF scales well up to thousands of concurrent jobs

Conclusions and Future Work

ELF introduces a ‘many masters/many workers’ framework to achieve:

•  good performance: low latency and high throughput,

•  scalability: for diverse, concurrent jobs, and

•  new functionality: job flexibility, feedback, and coordination.

Future work: understand the limitations of ELF; explore more complex web-tier processing; look at task isolation.

Thank you!

Liting Hu foxting@gatech.edu