Spark Streaming - UC Berkeley AMP Camp (ampcamp.berkeley.edu/.../2012/...spark-streaming.pdf)


Spark Streaming: Large-scale near-real-time stream processing

Tathagata Das (TD), UC Berkeley

Motivation

•  Many important applications must process large data streams at second-scale latencies
   –  Site statistics, intrusion detection, spam filtering, ...

•  Scaling these apps to 100s of nodes requires ...
   –  Fault-tolerance: for both crashes and stragglers
   –  Efficiency: for being cost-effective

•  Also would like to have ...
   –  Simple programming model
   –  Integration with batch + ad hoc queries

Current streaming frameworks don't meet both goals together.

Traditional Streaming Systems

•  Record-at-a-time processing model
   –  Each node has mutable state
   –  For each record, update state & send new records

[Diagram: input records are pushed through nodes 1-3, each maintaining mutable state and sending new records downstream.]

Traditional Streaming Systems

Fault tolerance via replication or upstream backup

[Diagram: two fault-tolerance schemes. Replication: each node (1-3) has a synchronized replica (1'-3') processing the same input. Upstream backup: a standby node takes over after a failure, replaying buffered input.]

Replication: fast recovery, but 2x hardware cost. Upstream backup: only needs 1 standby, but slow to recover.

Traditional Streaming Systems

Fault tolerance via replication or upstream backup

[Same diagram as above.]

Neither approach handles stragglers.

Observation

•  Batch processing models, like MapReduce, do provide fault tolerance efficiently
   –  Divide the job into deterministic tasks
   –  Rerun failed/slow tasks in parallel on other nodes

Idea

•  Idea: run a streaming computation as a series of very small, deterministic batch jobs
   –  E.g. process a stream of tweets in 1-second batches
   –  Same recovery schemes at a smaller timescale

•  Try to make the batch size as small as possible
   –  Lower batch size → lower end-to-end latency

•  State between batches kept in memory
   –  Deterministic stateful ops → fault tolerance
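To make the idea concrete, here is a toy sketch (plain Scala, not Spark code; all names are illustrative) that groups timestamped records into 1-second buckets and processes each bucket as a small deterministic batch:

// Toy illustration of micro-batching (not the Spark Streaming API)
case class Record(arrivalMs: Long, value: String)

// A deterministic per-batch computation: count occurrences of each value
def processBatch(batch: Seq[Record]): Map[String, Int] =
  batch.groupBy(_.value).map { case (w, rs) => (w, rs.size) }

// Bucket records into 1-second intervals by arrival time, then
// run the same deterministic batch job on each bucket in order
def runMicroBatches(records: Seq[Record]): Seq[Map[String, Int]] =
  records.groupBy(_.arrivalMs / 1000)
         .toSeq.sortBy(_._1)
         .map { case (_, batch) => processBatch(batch) }

Because each batch is deterministic, a failed or slow batch can simply be rerun, which is exactly the recovery scheme batch systems already use.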

Discretized Stream Processing

[Diagram: in each interval (time = 0-1, 1-2, ...), the input is stored reliably as an immutable dataset; batch operations produce immutable output/state datasets kept in memory as Spark RDDs, forming an input stream and a state stream.]

Fault Recovery

•  All datasets modeled as RDDs with a dependency graph → fault-tolerant without full replication

•  Fault/straggler recovery is done in parallel on other nodes → fast recovery

[Diagram: a map from an input dataset to an output dataset; lost partitions are recomputed in parallel from the input. Fast recovery without the cost of full replication.]

How Fast Can It Go?

•  Prototype can process 4 GB/s (40M records/s) of data on 100 nodes at sub-second latency

[Chart: max throughput with a given latency bound (1 or 2 s). Two panels, Grep and TopKWords: cluster throughput (GB/s, 0-5) vs. number of nodes (0-100), with 1 sec and 2 sec latency bounds.]

Comparison with Storm

•  Storm limited to 10,000 records/s/node

•  Also tried Apache S4: 7,000 records/s/node

•  Commercial systems report O(100K)

[Chart: Grep throughput (MB/s/node, 0-60) and TopK throughput (MB/s/node, 0-30) vs. record size (100, 1000, 10000 bytes), Spark vs. Storm.]

How Fast Can It Recover?

•  Recovers from faults/stragglers within 1 sec

[Chart: interval processing time (s, 0-2) vs. time (s, 0-75) for sliding WordCount on 10 nodes with a 30 s checkpoint interval; processing time spikes when a failure happens, then recovers quickly.]

Programming Interface

•  A Discretized Stream or DStream is a sequence of RDDs
   –  Represents a stream of data
   –  API very similar to RDDs

•  DStreams can be created ...
   –  Either from live streaming data
   –  Or by transforming other DStreams (see the sketch below)
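A minimal sketch of both creation paths, using the Spark Streaming API as later released (org.apache.spark.streaming; the socket source and 1-second batch duration are illustrative assumptions, and the talk's alpha API differs slightly):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setMaster("local[2]").setAppName("dstream-demo")
val ssc = new StreamingContext(conf, Seconds(1))   // 1-second batches

// Created from live streaming data (a TCP text source, here)
val lines = ssc.socketTextStream("localhost", 9999)

// Created by transforming another DStream
val words = lines.flatMap(_.split(" "))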

DStream Operators

•  Transformations
   –  Build new streams from existing streams
   –  Existing RDD ops (map, etc.) + new "stateful" ops

•  Output operators
   –  Send data to the outside world (save results to external storage, print to screen, etc.)
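For instance, continuing the sketch above (released-API names, shown only as an illustration):

// Transformation: builds a new DStream from an existing one
val upper = words.map(_.toUpperCase)

// Output operator: sends data to the outside world
upper.print()   // prints the first elements of each batch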

Example 1: Count the words received every second

words = createNetworkStream("http://...")
counts = words.count()

(words and counts are DStreams; count is a transformation)

[Diagram: for each interval (time = 0-1, 1-2, 2-3), the count transformation turns that interval's words RDD into the corresponding counts RDD; each box in a DStream = one RDD.]

Example 2: Count frequency of words received every second

words = createNetworkStream("http://...")
ones = words.map(w => (w, 1))
freqs = ones.reduceByKey(_ + _)

(w => (w, 1) and _ + _ are Scala function literals)

[Diagram: per interval (time = 0-1, 1-2, 2-3), words → ones via map, and ones → freqs via reduce.]

Example 2: Full code

// Create the context and set the batch size
val ssc = new SparkStreamContext("local", "test")
ssc.setBatchDuration(Seconds(1))

// Process a network stream
val words = ssc.createNetworkStream("http://...")
val ones = words.map(w => (w, 1))
val freqs = ones.reduceByKey(_ + _)
freqs.print()

// Start the stream computation
ssc.run
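For reference, a sketch of the same program against the Spark Streaming API as later released, since the alpha names above (SparkStreamContext, createNetworkStream, ssc.run) changed before release; the socket host/port here are illustrative:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Create the context with a 1-second batch size
val conf = new SparkConf().setMaster("local[2]").setAppName("WordFreq")
val ssc = new StreamingContext(conf, Seconds(1))

// Process a network stream
val words = ssc.socketTextStream("localhost", 9999).flatMap(_.split(" "))
val ones = words.map(w => (w, 1))
val freqs = ones.reduceByKey(_ + _)
freqs.print()

// Start the stream computation
ssc.start()
ssc.awaitTermination()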

Example 3: Count frequency of words received in the last minute

ones = words.map(w => (w, 1))
freqs = ones.reduceByKey(_ + _)
freqs_60s = freqs.window(Seconds(60), Seconds(1)).reduceByKey(_ + _)

(window is the sliding window operator; Seconds(60) is the window length and Seconds(1) the window movement)

[Diagram: per interval (time = 0-1, 1-2, 2-3), words → ones via map and ones → freqs via reduce; the window operator collects the last 60 s of freqs RDDs, which a second reduce combines into freqs_60s.]

Simpler window-based reduce

freqs = ones.reduceByKey(_ + _)
freqs_60s = freqs.window(Seconds(60), Seconds(1)).reduceByKey(_ + _)

can be written as:

freqs = ones.reduceByKeyAndWindow(_ + _, Seconds(60), Seconds(1))

“Incremental” window operators

[Diagram: two ways to compute freqs_60s from freqs across intervals t-1 ... t+4. With a plain aggregation function (+), each window re-aggregates all of its intervals. With an invertible aggregation function (+ and -), each new window is computed incrementally from the previous one by adding the newest interval and subtracting the oldest.]

Smarter window-based reduce

freqs = ones.reduceByKey(_ + _)
freqs_60s = freqs.window(Seconds(60), Seconds(1)).reduceByKey(_ + _)

which can be written as:

freqs = ones.reduceByKeyAndWindow(_ + _, Seconds(60), Seconds(1))

can, with an invertible aggregation function, be written as:

freqs = ones.reduceByKeyAndWindow(_ + _, _ - _, Seconds(60), Seconds(1))
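The invertible form works because each window can be updated incrementally: new window = previous window + newest batch - expired batch. A tiny worked example (plain Scala, illustrative numbers):

// Per-second counts for one word across six intervals
val perBatch = Seq(3, 1, 4, 1, 5, 9)

// Window of 3 batches, naive: re-sum every window
val naive = perBatch.sliding(3).map(_.sum).toSeq   // Seq(8, 6, 10, 15)

// Incremental: add the entering batch, subtract the leaving one
val first = perBatch.take(3).sum                   // 8
val incremental = perBatch.drop(3).zip(perBatch)
  .scanLeft(first) { case (prev, (entering, leaving)) => prev + entering - leaving }
// Seq(8, 6, 10, 15) -- same result, O(1) work per slide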

Output Operators

•  save: write results to any Hadoop-compatible storage system (e.g. HDFS, HBase)

freqs.save("hdfs://...")

•  foreachRDD: run a Spark function on each RDD

freqs.foreachRDD(freqsRDD => {
  // any Spark/Scala processing, maybe save to database
})
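A common foreachRDD pattern, sketched against the released API (saveAsTextFiles is the released counterpart of the save shown above; sendToDb is a hypothetical sink helper):

// Released-API equivalent of save: writes one directory per batch
freqs.saveAsTextFiles("hdfs://.../freqs")

// Write out each partition of each batch's RDD
freqs.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    // sendToDb is hypothetical; in real code, open one connection per partition here
    records.foreach(sendToDb)
  }
}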

Live + Batch + Interactive

•  Combining DStreams with historical datasets

freqs.join(oldFreqs).map(...)

•  Interactive queries on stream state from the Spark interpreter

freqs.slice("21:00", "21:05").topK(10)

One stack to rule them all

•  The promise of a unified data analytics stack
   –  Write algorithms only once
   –  Cuts the complexity of maintaining separate stacks for live and batch processing
   –  Query live stream state instead of waiting for an import

•  Feedback has been very exciting

•  Some recent experiences ...

Implementation on Spark

•  Optimizations on current Spark
   –  Optimized scheduling for < 100 ms tasks
   –  New block store with fast NIO communication
   –  Pipelining of jobs from different time intervals

•  Changes already in the dev branch of Spark at http://www.github.com/mesos/spark

•  An alpha will be released with Spark 0.6 soon

More Details

•  You can find more about Spark Streaming in our paper: http://tinyurl.com/dstreams

Thank you!

Fault Recovery

[Chart, failures: interval processing time (s) before failure vs. at the time of failure, for WordCount with 30 s checkpoints, WordMax with no checkpoints, and WordMax with 10 s checkpoints; bar values 1.47, 1.66, 1.75 (before failure) and 2.31, 2.64, 2.03 (at the time of failure).]

[Chart, stragglers: interval processing time (s) for WordCount and Grep; 0.79 and 0.66 with no straggler vs. 1.09 and 1.08 with a straggler and speculation enabled.]

Interactive Ad-Hoc Queries

Related Work

•  Bulk incremental processing (CBP, Comet)
   –  Periodic (~5 min) batch jobs on Hadoop/Dryad
   –  On-disk, replicated FS for storage instead of RDDs

•  Hadoop Online
   –  Does not recover stateful ops or allow multi-stage jobs

•  Streaming databases
   –  Record-at-a-time processing, generally replication for FT

•  Approximate query processing, load shedding
   –  Do not support the loss of arbitrary nodes
   –  Different math because the drop rate is known exactly

•  Parallel recovery (MapReduce, GFS, RAMCloud, etc.)

Timing Considerations

•  D-streams group input into intervals based on when records arrive at the system

•  For apps that need to group by an "external" time and tolerate network delays, we support:
   –  Slack time: delay starting a batch for a short fixed time to give records a chance to arrive
   –  Application-level correction: e.g. give a result for time t at time t+1, then use later records to update it incrementally at time t+5 (see the sketch below)
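One way to express application-level correction, sketched against the released API (records is a hypothetical input DStream and eventSecond a hypothetical function extracting each record's external event time):

// Key each record by its external (event-time) second, not arrival time
val byEventTime = records.map(r => (eventSecond(r), 1L))

// Keep a running count per event-time bucket; late records that arrive in
// later batches simply update the bucket they belong to
// (requires ssc.checkpoint(...) to be set for stateful ops)
val corrected = byEventTime.updateStateByKey[Long] {
  (newCounts: Seq[Long], prev: Option[Long]) =>
    Some(prev.getOrElse(0L) + newCounts.sum)
}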