Flink 0.10 @ Bay Area Meetup (October 2015)
-
Upload
stephan-ewen -
Category
Software
-
view
995 -
download
2
Transcript of Flink 0.10 @ Bay Area Meetup (October 2015)
Stephan Ewen, Kostas TzoumasFlink committers
co-founders @ data Artisans@StephanEwen, @kostas_tzoumas
Flink-0.10
What is Flink
2
Gelly
Tabl
e
ML
SAM
OA
DataSet (Java/Scala) DataStream (Java/Scala)
Hado
op M
/R
Local Remote Yarn Tez Embedded
Data
flow
Data
flow
(WiP
)
MRQ
L
Tabl
e
Casc
adin
g (W
iP)
Streaming dataflow runtime
Flink 0.10 Summary
Focus on operational readiness• high availability• monitoring• integration with other systems
First-class support for event time
Refined DataStream API: easy and powerful
3
Improved DataStream API Stream data analysis differs from batch data analysis
by introducing time
Streams are unbounded and produce data over time
Simple as batch API if handling time in a simple way
Powerful if you want to handle time in an advanced way (out-of-order records,preliminary results, etc)
4
Improved DataStream API
5
case class Event(location: Location, numVehicles: Long)
val stream: DataStream[Event] = …;
stream .filter { evt => isIntersection(evt.location) }
Improved DataStream API
6
case class Event(location: Location, numVehicles: Long)
val stream: DataStream[Event] = …;
stream .filter { evt => isIntersection(evt.location) } .keyBy("location") .timeWindow(Time.of(15, MINUTES), Time.of(5, MINUTES)) .sum("numVehicles")
Improved DataStream API
7
case class Event(location: Location, numVehicles: Long)
val stream: DataStream[Event] = …;
stream .filter { evt => isIntersection(evt.location) } .keyBy("location") .timeWindow(Time.of(15, MINUTES), Time.of(5, MINUTES)) .trigger(new Threshold(200)) .sum("numVehicles")
Improved DataStream API
8
case class Event(location: Location, numVehicles: Long)
val stream: DataStream[Event] = …;
stream .filter { evt => isIntersection(evt.location) } .keyBy("location") .timeWindow(Time.of(15, MINUTES), Time.of(5, MINUTES)) .trigger(new Threshold(200)) .sum("numVehicles")
.keyBy( evt => evt.location.grid ) .mapWithState { (evt, state: Option[Model]) => { val model = state.orElse(new Model()) (model.classify(evt), Some(model.update(evt))) }}
IoT / Mobile Applications
9
Events occur on devices
Queue / Log
Events analyzed in a
data streaming system
Stream Analysis
Events stored in a log
IoT / Mobile Applications
10
IoT / Mobile Applications
11
IoT / Mobile Applications
12
IoT / Mobile Applications
13
Out of order !!!
First burst of eventsSecond burst of events
IoT / Mobile Applications
14
Event time windows
Arrival time windows
Instant event-at-a-time
Flink supports out of order time (event time) windows,arrival time windows (and mixtures) plus low latency processing.
First burst of eventsSecond burst of events
15
We need aGlobal Clockthat runs onevent time instead of processing time.
16
This is a source
This is our window operator
1
0
0
0 0
1
2
1
2
1
1
This is the current event-time time
2
2
2
2
2
This is a watermark.
17
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setStreamTimeCharacteristic(ProcessingTime);
DataStream<Tweet> text = env.addSource(new TwitterSrc());
DataStream<Tuple2<String, Integer>> counts = text .flatMap(new ExtractHashtags()) .keyBy(“name”) .timeWindow(Time.of(5, MINUTES) .apply(new HashtagCounter());
Processing Time
18
Event TimeStreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setStreamTimeCharacteristic(EventTime);
DataStream<Tweet> text = env.addSource(new TwitterSrc());text = text.assignTimestamps(new MyTimestampExtractor());
DataStream<Tuple2<String, Integer>> counts = text .flatMap(new ExtractHashtags()) .keyBy(“name”) .timeWindow(Time.of(5, MINUTES) .apply(new HashtagCounter());
32-35
24-27
20-23
8-110-3
4-7
19Tumbling Windows of 4 Seconds
123412
4
59
9 0
20
20
22212326323321
26
353642
20
High Availability and Consistency
21
No Single-Point-Of-Failureany more
Exactly-once processing semanticsacross pipeline
Checkpoints/Fault Tolerance is decoupled from windows Allows for highly flexible window implementations
ZooKeeperensemble
MultipleMasters
failover
Operator State
Stateless operators System state
User defined state
22
ds.filter(_ != 0)
ds.keyBy("id").timeWindow(Time.of(5, SECONDS)).reduce(…)
public class CounterSum implements RichReduceFunction<Long> { private OperatorState<Long> counter;
@Override public Long reduce(Long v1, Long v2) throws Exception { counter.update(counter.value() + 1); return v1 + v2; }
@Override public void open(Configuration config) { counter = getRuntimeContext().getOperatorState(“counter”, 0L, false); }}
Checkpoints
Consistent snapshots of distributed data stream and operator state
23
Barriers
Markers for checkpoints Injected in the data flow
24
25
Alignment for multi-input operators
High Availability and Consistency
26
No Single-Point-Of-Failureany more
Exactly-once processing semanticsacross pipeline
Checkpoints/Fault Tolerance is decoupled from windows Allows for highly flexible window implementations
ZooKeeperensemble
MultipleMasters
failover
Without high availability
27
JobManager
TaskManager
With high availability
28
JobManager
TaskManager
Stand-byJobManager
Apache Zookeeper™
KEEP GOING
Persisting jobs
29
JobManager
Client
TaskManagers
Apache Zookeeper™
Job
1. Submit job
Persisting jobs
30
JobManager
Client
TaskManagers
Apache Zookeeper™
1. Submit job2. Persist execution graph
Persisting jobs
31
JobManager
Client
TaskManagers
Apache Zookeeper™
1. Submit job2. Persist execution graph3. Write handle to ZooKeeper
Persisting jobs
32
JobManager
Client
TaskManagers
Apache Zookeeper™
1. Submit job2. Persist execution graph3. Write handle to ZooKeeper4. Deploy tasks
Handling checkpoints
33
JobManager
Client
TaskManagers
Apache Zookeeper™
1. Take snapshots
Handling checkpoints
34
JobManager
Client
TaskManagers
Apache Zookeeper™
1. Take snapshots2. Persist snapshots3. Send handles to JM
Handling checkpoints
35
JobManager
Client
TaskManagers
Apache Zookeeper™
1. Take snapshots2. Persist snapshots3. Send handles to JM4. Create global checkpoint
Handling checkpoints
36
JobManager
Client
TaskManagers
Apache Zookeeper™
1. Take snapshots2. Persist snapshots3. Send handles to JM4. Create global checkpoint5. Persist global checkpoint
Handling checkpoints
37
JobManager
Client
TaskManagers
Apache Zookeeper™
1. Take snapshots2. Persist snapshots3. Send handles to JM4. Create global checkpoint5. Persist global checkpoint6. Write handle to ZooKeeper
Performance
38
Continuousstreaming
Latency-boundbuffering
DistributedSnapshots
High Throughput &Low Latency
With configurable throughput/latency tradeoff
Batch and Streaming
39
case class WordCount(word: String, count: Int)
val text: DataStream[String] = …;
text .flatMap { line => line.split(" ") } .map { word => new WordCount(word, 1) } .keyBy("word") .window(GlobalWindows.create()) .trigger(new EOFTrigger()) .sum("count")
Batch Word Count in the DataStream API
Batch and Streaming
40
Batch Word Count in the DataSet API
case class WordCount(word: String, count: Int)
val text: DataStream[String] = …;
text .flatMap { line => line.split(" ") } .map { word => new WordCount(word, 1) } .keyBy("word") .window(GlobalWindows.create()) .trigger(new EOFTrigger()) .sum("count")
val text: DataSet[String] = …;
text .flatMap { line => line.split(" ") } .map { word => new WordCount(word, 1) } .groupBy("word") .sum("count")
Batch and Streaming
41
Pipelined andblocking operators Streaming Dataflow Runtime
Batch Parameters
DataSet DataStream
RelationalOptimizer
WindowOptimization
Pipelined andwindowed operators
Schedule lazilySchedule eagerly
Recompute wholeoperators Periodic checkpoints
Streaming data movement
Stateful operations
DAG recoveryFully buffered streams DAG resource management
Streaming Parameters
Batch and Streaming
42
A full-fledged batch processor as well
Batch and Streaming
43
A full-fledged batch processor as well
More details at Dongwon Kim's Talk"A comparative performance evaluation of Flink"
Integration (picture not complete)
44
POSIX Java/ScalaCollections
POSIX
Monitoring
45
Life system metrics anduser-defined accumulators/statistics
Get http://flink-m:8081/jobs/7684be6004e4e955c2a558a9bc463f65/accumulators
Monitoring REST API for custom monitoring tools
{ "id": "dceafe2df1f57a1206fcb907cb38ad97", "user-accumulators": [ { "name":"avglen", "type":"DoubleCounter", "value":"123.03259440000001" }, { "name":"genwords", "type":"LongCounter", "value":"75000000" } ] }
Flink 0.10 Summary
Focus on operational readiness• high availability• monitoring• integration with other systems
First-class support for event time
Refined DataStream API: easy and powerful
46