Functional Comparison and Performance Evaluation
Mao, Wei; Wang, Huafeng; Zhang, Tianlun
Overview
• Streaming Core
• MISC
• Performance Benchmark
Choose your weapon!
Frameworks: Spark Streaming, Flink, Storm, Storm Trident, Gearpump, Heron
This is the critical part, as it affects many features.
• Micro-Batch, Checkpoint per Batch
• Continuous Streaming, Checkpoint "per Batch"
• Continuous Streaming, Ack per Record
[Diagrams: each model is drawn as Source → Operator → Sink. Labels shown alongside: Acker; JobManager/HDFS (id, offset, state, str, ack); Driver, Storage, job status, HDFS (id, offset, state, str).]
• Continuous Streaming, Ack per Record
• Micro-Batch, Checkpoint per Batch
• Continuous Streaming, Checkpoint "per Batch"
[Figure: the three models placed along three axes: Low Latency to High Latency, High Throughput to Low Throughput, and High Overhead to Low Overhead.]
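A rough way to see why acknowledgement granularity drives overhead is to count coordination messages. The sketch below is a toy model only, not any framework's implementation; the function names and the batch size of 10,000 are assumptions chosen for illustration.

```python
import math

def acks_per_record(num_records: int) -> int:
    """Ack per record: one acknowledgement for every record."""
    return num_records

def checkpoints_per_batch(num_records: int, batch_size: int) -> int:
    """Checkpoint per batch: one checkpoint per (possibly partial) batch."""
    return math.ceil(num_records / batch_size)

records = 1_000_000
ack_msgs = acks_per_record(records)
ckpt_msgs = checkpoints_per_batch(records, batch_size=10_000)
print(ack_msgs, ckpt_msgs)
```

In this model the coordination cost of per-record acking grows linearly with the input, while per-batch checkpointing amortizes it over the batch, at the price of waiting for a batch boundary.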
Delivery Guarantee
At least once
• Ackers know whether a record was processed successfully. If it failed, it is replayed.
• There is no state consistency guarantee.
Exactly once
• State is persisted in durable storage.
• Checkpoint is linked with state storage per batch.
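The difference between the two guarantees can be sketched with a word counter that suffers one failure. This is a simplified simulation under stated assumptions: the failure is modelled as a lost acknowledgement after the state update, and the `(offset, state)` checkpoint tuple, function names, and batch size are hypothetical.

```python
from collections import Counter

def at_least_once(words, fail_at):
    """Replay a record whose ack was lost; state is never rolled back."""
    counts = Counter()
    failed = False
    i = 0
    while i < len(words):
        counts[words[i]] += 1            # state is updated first
        if i == fail_at and not failed:  # ack lost, so the record is replayed
            failed = True
            continue
        i += 1
    return counts

def exactly_once(words, fail_at, batch_size=2):
    """Checkpoint offset and state together; on failure restore both."""
    checkpoint = (0, Counter())          # (offset, state) persisted together
    counts, i, failed = Counter(), 0, False
    while i < len(words):
        counts[words[i]] += 1
        if i == fail_at and not failed:
            failed = True
            i, counts = checkpoint[0], Counter(checkpoint[1])  # roll back
            continue
        i += 1
        if i % batch_size == 0:          # batch boundary: take a checkpoint
            checkpoint = (i, Counter(counts))
    return counts

words = ["foo", "foo", "bar"]
print(at_least_once(words, fail_at=1))
print(exactly_once(words, fail_at=1))
```

With replay alone, the record at the failure point is counted twice. Restoring offset and state from the same checkpoint keeps the count consistent.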
Native State Operator
Yes* Yes Yes
• Flink Java API: ValueState, ListState, ReducingState
• Flink Scala API: mapWithState
• Gearpump: persistState
• Spark 1.5: updateStateByKey
• Spark 1.6: mapWithState
• Trident: persistentAggregate, State
• Storm: KeyValueState
• Heron: X (state is maintained by the user)
Dynamic Load Balance & Recovery Speed
[Figure: two diagrams, each a Source feeding three exec nodes, annotated with timings: 10s, 5s, and 10s + 5s = 15s.]
Compositional
• Highly customizable operator based on basic building blocks
• Manual topology definition and optimization
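    TopologyBuilder builder = new TopologyBuilder();
    // One spout task emits random sentences.
    builder.setSpout("input", new RandomSentenceSpout(), 1);
    // Three splitter tasks; sentences are distributed randomly among them.
    builder.setBolt("split", new SplitSentence(), 3).shuffleGrouping("input");
    // Two counter tasks; the same word always goes to the same task.
    builder.setBolt("count", new WordCount(), 2).fieldsGrouping("split", new Fields("word"));
"foo, foo, bar" → "foo", "foo", "bar" → {"foo": 2, "bar": 1}
Spout → Bolt → Bolt
Frameworks: Storm, Gearpump, Heron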
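The same spout and bolt wiring can be sketched in plain Python to show what each stage contributes. This is an illustration only and does not use the Storm API; `spout`, `split_bolt`, and `count_bolt` are hypothetical names, and hashing the word stands in for fields grouping.

```python
from collections import Counter

def spout():
    """Source: emits sentences (stands in for RandomSentenceSpout)."""
    yield "foo, foo, bar"

def split_bolt(sentences):
    """First bolt: splits each sentence into words."""
    for sentence in sentences:
        for word in sentence.split(","):
            yield word.strip()

def count_bolt(words, parallelism=2):
    """Second bolt: route each word to one of `parallelism` counters by hash,
    so a given word is always counted by the same task."""
    tasks = [Counter() for _ in range(parallelism)]
    for word in words:
        tasks[hash(word) % parallelism][word] += 1
    merged = Counter()
    for task in tasks:
        merged.update(task)
    return dict(merged)

result = count_bolt(split_bolt(spout()))
print(result)
```

Every routing decision is written by hand here, which mirrors the "manual topology definition" point above.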
Declarative
• Higher order function as operators (map, filter, mapWithState…)
• Logical plan optimization
    DataStream<String> text = env.readTextFile(params.get("input"));
    DataStream<Tuple2<String, Integer>> counts = text.flatMap(new Tokenizer()).keyBy(0).sum(1);
"foo, foo, bar" → "foo", "foo", "bar" → {"foo": 1, "foo": 1, "bar": 1} → {"foo": 2, "bar": 1}
Frameworks: Spark Streaming, Flink, Storm Trident, Gearpump
Statistical
• Data scientist friendly
• Dynamic type
Python
    lines = ssc.textFileStream(params.get("input"))
    words = lines.flatMap(lambda line: line.split(","))
    pairs = words.map(lambda word: (word, 1))
    counts = pairs.reduceByKey(lambda x, y: x + y)
    counts.saveAsTextFiles(params.get("output"))
R
    lines <- textFile(sc, "input")
    words <- flatMap(lines, function(line) {
      strsplit(line, " ")[[1]]
    })
    wordCount <- lapply(words, function(word) {
      list(word, 1L)
    })
    counts <- reduceByKey(wordCount, "+", 2L)
Frameworks: Spark Streaming, Storm, Heron*, Structured Streaming*, Storm*
SQL
Frameworks: Spark Streaming, Flink, Storm Trident, Structured Streaming*
Styles: Fusion Style, Pure Style
Storm SQL:
    CREATE EXTERNAL TABLE ORDERS (ID INT PRIMARY KEY, UNIT_PRICE INT, QUANTITY INT)
        LOCATION 'kafka://localhost:2181/brokers?topic=orders' TBLPROPERTIES '{...}}'
    INSERT INTO LARGE_ORDERS SELECT ID, UNIT_PRICE * QUANTITY AS TOTAL
        FROM ORDERS WHERE UNIT_PRICE * QUANTITY > 50

    bin/storm sql XXXX.sql
Spark Streaming:
    InputDStream.transform((rdd: RDD[Order], time: Time) => {
      import sqlContext.implicits._
      // Expose the current batch as a table so it can be queried with SQL.
      rdd.toDF().registerTempTable("ORDERS")
      val SQL = "SELECT ID, UNIT_PRICE * QUANTITY AS TOTAL FROM ORDERS WHERE UNIT_PRICE * QUANTITY > 50"
      val largeOrderDF = sqlContext.sql(SQL)
      largeOrderDF.rdd
    })
Summary
| Compositional | Declarative | Python/R | SQL |
| --- | --- | --- | --- |
| X | √ | √ | √ |
| √ | X | √ | NOT support aggregation, windowing and joining |
| X | √ | X | |
| √ | √ | X | X |
| X | √ | X | Support select, from, where, union |
| √ | X | √* | X |
Frameworks: Spark Streaming, Flink, Storm, Storm Trident, Gearpump, Heron
• Multi Tasks of Multi Applications on Single Process (Flink)
[Diagram: JVM processes whose threads run tasks from application A and tasks from application B.]
• Single Task on Single Process (Heron)
[Diagram: JVM processes that each run one task and connect with the local SM.]
• Multi Tasks of Single application on Single Process
o Single task on single thread
o Multi tasks on single thread
Frameworks: Gearpump, Spark Streaming, Storm, Storm Trident
[Diagrams: JVM processes containing threads, with either one task per thread or several tasks per thread.]
• Window Support
• Out-of-order Processing
• Memory Management
• Resource Management
• Web UI
• Community Maturity
Window Support
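• Sliding Window
• Count Window
• Session Window
[Figure: session windows on a time axis t; events separated by less than the session gap fall into the same session.]
| Sliding Window | Count Window | Session Window |
| --- | --- | --- |
| √ | X | X* |
| √ | √ | X |
| √ | √ | X |
| √* | X | X |
| √ | √ | √ |
| X | X | X |
Frameworks: Spark Streaming, Flink, Storm, Storm Trident, Gearpump, Heron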
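The three window types can be sketched as simple assignment functions. This is a minimal illustration, not the windowing API of any framework above; the function names and parameters (`size`, `slide`, `n`, `gap`) are assumptions.

```python
def sliding_windows(ts, size, slide):
    """Return the start times of every sliding window [start, start + size)
    that contains timestamp `ts`."""
    start = ts - ts % slide              # latest window start at or before ts
    starts = []
    while start > ts - size:
        starts.append(start)
        start -= slide
    return sorted(starts)

def count_windows(events, n):
    """Group events into consecutive windows of `n` elements."""
    return [events[i:i + n] for i in range(0, len(events), n)]

def session_windows(timestamps, gap):
    """Keep an event in the current session while it arrives less than `gap`
    after the previous event; otherwise start a new session."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < gap:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions

print(sliding_windows(12, size=10, slide=5))
print(count_windows([1, 2, 3, 4, 5], 2))
print(session_windows([1, 2, 3, 10, 11, 30], gap=5))
```

A sliding window assigns one event to several overlapping windows, a count window closes after a fixed number of events, and a session window closes only when the stream goes quiet for the gap duration.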
Out-of-order Processing
| Processing Time | Event Time | Watermark |
| --- | --- | --- |
| √ | √* | X* |
| √ | √ | √ |
| √ | X | X |
| √ | √ | √ |
| √ | √ | √ |
| √ | X | X |
Frameworks: Spark Streaming, Flink, Storm, Storm Trident, Gearpump, Heron
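To illustrate how event time and a watermark interact, here is a small sketch. It assumes one common strategy, a watermark that trails the largest event time seen so far by a fixed bound; the function name and the `max_out_of_orderness` parameter are hypothetical and not taken from any framework above.

```python
def split_late_events(event_times, max_out_of_orderness):
    """Classify events (given in arrival order) as on-time or late.

    The watermark trails the largest event time seen so far by
    `max_out_of_orderness`; an event older than the watermark is late.
    """
    watermark = float("-inf")
    on_time, late = [], []
    for ts in event_times:
        if ts < watermark:
            late.append(ts)
        else:
            on_time.append(ts)
        watermark = max(watermark, ts - max_out_of_orderness)
    return on_time, late

on_time, late = split_late_events([1, 5, 3, 10, 4, 9], max_out_of_orderness=3)
print(on_time, late)
```

Event 3 arrives out of order but is still within the bound, so it is accepted. Event 4 arrives after the watermark has advanced to 7 and is treated as late.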
Memory Management
| JVM Manage | Self Manage on-heap | Self Manage off-heap |
| --- | --- | --- |
| √ | √* | √* |
| √ | √ | √ |
| √ | X | X |
| √ | X | X |
| √ | X | X |
Frameworks: Spark Streaming, Flink, Storm, Gearpump
Resource Management
| Standalone | YARN | Mesos |
| --- | --- | --- |
| √ | √ | √ |
| √ | √ | √ |
| √ | √ | √ |
| √ | √ | X |
| √ | √ | X |
| √ | √ | √ |
Frameworks: Spark Streaming, Flink, Storm, Storm Trident, Gearpump, Heron
Web UI
| Submit Jobs | Cancel Jobs | Inspect Jobs | Show Statistics | Show Input Rate | Check Exceptions | Inspect Config | Alert |
| --- | --- | --- | --- | --- | --- | --- | --- |
| X | √ | √ | √ | √ | √ | √ | X |
| X | √ | √ | √ | √* | √ | √ | X |
| √ | √ | √ | √ | √* | √ | √ | X |
| √ | √ | √ | √ | X | √ | √ | X |
| X | X | √ | √ | √* | √ | √ | X |
Frameworks: Spark Streaming, Flink, Storm, Gearpump
Community Maturity
[Charts: "Past 3 Months Summary on JIRA" and "Past 1 Month Summary on GitHub", two series each.]
| Initiation Time | Apache Top Project | Contributors |
| --- | --- | --- |
| 2013 | 2014 | 926 |
| 2011 | 2014 | 219 |
| 2014 | Incubator | 21 |
| 2010 | 2015 | 208 |
| 2014 | N/A | 44 |
Frameworks: Spark Streaming, Flink, Storm, Gearpump
HiBench 6.0
Test Philosophy
• "Lazy Benchmarking"
• Simple test cases from which practical use cases can be inferred
Architecture
[Diagram: a Data Generator feeds Topic A on a cluster of 3 Kafka Brokers. The Test Cluster (Standalone) has a Client, a Master, and 7 Slaves (20 cores, 80 GB memory each); it consumes Topic A and produces Topic B. A Metrics Reader turns the In Time and Out Time of records into a Result on the File System.]
Latency = Out Time - In Time
Frameworks: Spark Streaming, Flink, Storm, Gearpump
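The latency metric above, together with the P99 figure reported in the result charts, can be computed along these lines. This is a sketch rather than HiBench's actual metrics reader; the function names are hypothetical and the nearest-rank percentile is one of several common definitions.

```python
def latencies_ms(in_times, out_times):
    """Per-record latency: Out Time - In Time."""
    return [out - inp for inp, out in zip(in_times, out_times)]

def percentile(values, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    ordered = sorted(values)
    rank = (p * len(ordered) + 99) // 100   # ceil(p * n / 100) in integers
    return ordered[max(rank, 1) - 1]

in_times = [0, 10, 20, 30]     # illustrative timestamps in ms
out_times = [5, 12, 40, 31]
lat = latencies_ms(in_times, out_times)
print(lat, percentile(lat, 99))
```

Reporting a high percentile rather than the mean makes occasional slow records visible, which matters when the goal is a latency SLA.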
The Setup
Kafka Cluster (x3)
• CPU: 2 x Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
• Mem: 128 GB
• Disk: 8 x HDD (1TB)
• Network: 10 Gbps
Test Cluster (x7)
• CPU: 2 x Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz
• Core: 20 / 24
• Mem: 80 / 128 GB
• Disk: 8 x HDD (1TB)
• Network: 10 Gbps
The two clusters are connected at 10 Gbps.
| Name | Version |
| --- | --- |
| Java | 1.8 |
| Scala | 2.11.7 |
| Hadoop | 2.6.2 |
| Zookeeper | 3.4.8 |
| Kafka | 0.8.2.2 |
| Spark | 1.6.1 |
| Storm | 1.0.1 |
| Flink | 1.0.3 |
| Gearpump | 0.8.1 |
• Heron requires a specific operating system (Ubuntu/CentOS/Mac OS)
• Structured Streaming doesn't support Kafka source yet (Spark 2.0)
Raw Input Data
• Kafka Topic Partition: 140
• Size Per Message (configurable): 200 bytes
• Raw Input Message Example:
"0,227.209.164.46,nbizrgdziebsaecsecujfjcqtvnpcnxxwiopmddorcxnlijdizgoi,1991-06-10,0.115967035,Mozilla/5.0 (iPhone; U; CPU like Mac OS X) AppleWebKit/420.1 (KHTML like Gecko) Version/3.0 Mobile/4A93 Safari/419.3,YEM,YEM-AR,snowdrops,1"
• Strong Type: class UserVisit (ip, sessionId, browser)
• Keep feeding data at a specific rate for 5 minutes
Framework Configuration
| Framework | Related Configuration |
| --- | --- |
| Spark Streaming | 7 Executor, 140 Parallelism |
| Flink | 7 TaskManager, 140 Parallelism |
| Storm | 28 Worker, 140 KafkaSpout |
| Gearpump | 28 Executors, 140 KafkaSource |
Data Input Rate
| Throughput | Messages/Second | Kafka Producer Num |
| --- | --- | --- |
| 40KB/s | 0.2K | 1 |
| 400KB/s | 2K | 1 |
| 4MB/s | 20K | 1 |
| 40MB/s | 200K | 1 |
| 80MB/s | 400K | 1 |
| 400MB/s | 2M | 10 |
| 600MB/s | 3M | 15 |
| 800MB/s | 4M | 20 |
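The throughput figures follow from the message rate and the 200-byte message size given on the Raw Input Data slide. A quick check, assuming decimal units (1 MB = 1,000,000 bytes); the function name is made up for this sketch:

```python
MESSAGE_SIZE_BYTES = 200  # "Size Per Message" from the Raw Input Data slide

def throughput_bytes_per_sec(messages_per_sec):
    """Input throughput = message rate x message size."""
    return messages_per_sec * MESSAGE_SIZE_BYTES

rates = [200, 2_000, 20_000, 200_000, 400_000, 2_000_000, 3_000_000, 4_000_000]
for rate in rates:
    print(rate, "msg/s =", throughput_bytes_per_sec(rate) / 1_000_000, "MB/s")
```

For example, 0.2K messages/s x 200 bytes = 40 KB/s, and 4M messages/s x 200 bytes = 800 MB/s.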
Test Case: Identity
The application reads input data from Kafka and immediately writes the result back to Kafka. No complex business logic is involved.
Result
[Chart: P99 Latency (ms) vs. Input Rate (MB/s), four series.]
Test Case: Repartition
This test case measures the efficiency of data shuffling.
Network Shuffle
Result
[Chart: P99 Latency (ms) vs. Input Rate (MB/s), five series.]
[Chart: Throughput (MB/s) vs. Input Rate (MB/s), five series.]
Observation
• Flink and Storm have similar performance and are the better choices for meeting a sub-second SLA requirement when no repartition happens.
• Spark Streaming needs to schedule tasks with additional context. With a tiny batch interval, the overhead can be dramatically worse than in other frameworks.
• According to our test, the minimum batch interval of Spark is about 80 ms (140 tasks per batch); below that, the task scheduling delay keeps increasing.
• Repartition is heavy for every framework, but it is usually unavoidable.
• Latency of Gearpump is still quite low even under 800MB/s input throughput.
Test Case: Stateful WordCount
Native state operator is supported by all frameworks we evaluated
Stateful operator performance + Checkpoint/Acker cost
Result
[Chart: P99 Latency (ms) vs. Input Rate (MB/s), five series.]
[Chart: Throughput (MB/s) vs. Input Rate (MB/s), four series.]
Observation
• Exactly-once semantics usually require state management and checkpointing, and these better guarantees come at a high cost.
• There is no obvious performance difference in Flink when switching fault tolerance on or off.
• Checkpoint mechanisms and storage play a critical role here.
Test Case: Window Based Aggregation
This test case manages a 10-second sliding window.
Latency = End2End - window.duration
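The correction above can be written out as a small helper. This sketch assumes End2End is measured as Out Time minus In Time, as in the Architecture slide, and that times are in milliseconds; the function and constant names are hypothetical.

```python
WINDOW_DURATION_MS = 10_000  # the 10-second window used by this test case

def window_latency_ms(in_time_ms, out_time_ms, window_ms=WINDOW_DURATION_MS):
    """Latency = End2End - window.duration."""
    end_to_end = out_time_ms - in_time_ms
    return end_to_end - window_ms

# A record that entered at t = 1,000 ms and whose window result left at
# t = 11,250 ms spent 10,000 ms waiting for the window to close.
print(window_latency_ms(1_000, 11_250))
```

Subtracting the window duration removes the time a record necessarily waits for its window to close, leaving only the framework's processing delay.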
Result
[Chart: P99 Latency (ms) vs. Input Rate (MB/s), three series.]
[Chart: Throughput (MB/s) vs. Input Rate (MB/s), three series.]
Observation
• The native streaming execution model helps here.
Frameworks: Spark Streaming, Flink, Storm
Do your own benchmark
HiBench: a cross-platform micro-benchmark suite for big data
(https://github.com/intel-hadoop/HiBench)
Open source since 2012
Better streaming benchmark support will be included in the next release (HiBench 6.0)
Legal Disclaimer
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.
Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
Copyright ©2016 Intel Corporation.