Zuse Institute Berlin (ZIB)
Big Data Analytics on Cray XC Series DataWarp using Hadoop, Spark and Flink
CUG 2016
R. Schmidtke
May 12, 2016
Update May 12, 2016: How absence of DVS client caching can mess up your results in practice.
TDS at ZIB
Test & Development System: mostly exclusive usage.
16 XC30 compute nodes, 10-core IvyBridge Xeon, 32 GiB memory.
8 DataWarp nodes, 2x1.6 TiB SSDs, very quiet, persistent & striped (8 MiB) & scratch.
2 Lustre (80 OST/2.3 PiB, 48 OST/1.4 PiB), production usage, no striping.
Perfect for Big Data!
R. Schmidtke — Big Data on Cray XC & DataWarp — Slide 1/12
Approach
Hadoop, Spark and Flink as common data processing engines on CCM.
TeraSort, Streaming and SQL Join as well understood big data applications.
Hadoop: Robust, but lots of I/O because of shuffle.
Spark: Great scaling, but many IOPS (as we’ve heard multiple times this week already, and will again in 10 minutes).
Flink: Think Spark with support for true stream processing, off-heap memory and iterations.
Suddenly: Reality
Tuning with that many parameters (TeraSort/Streaming/SQL, YARN, HDFS, Hadoop/Spark/Flink on DataWarp/Lustre) quickly becomes a life task.
We’ll take you on a lightweight version of our journey top-down, let’s start with TeraSort on Hadoop and DataWarp (i.e. HDFS data and Hadoop temporary directories).
[Figure: execution time in seconds (log scale, 100 to 100000) vs. no. of worker nodes (2, 6, 10, 14). Series: TeraSort Wall Time, Total Map Wall Time, Total Reduce Wall Time.]
Between 4h34m and 0h49m, around 30 MiB/s per-node throughput. (Lustre: between 3h18m and 0h25m, around 50 MiB/s per-node throughput.)
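The per-node throughput figures can be reproduced with simple arithmetic. The sketch below assumes a 1 TiB TeraSort input (inferred from the later "1 TiB each is the minimum" remark, not stated on this slide) and assumes that the slowest and fastest wall times belong to the 2-node and 14-node runs (inferred from the plot axis). The function name is illustrative only.

```python
MIB_PER_TIB = 1024 * 1024


def per_node_throughput_mib_s(data_tib, nodes, hours, minutes):
    """Return data volume divided by (nodes * wall time), in MiB/s."""
    wall_seconds = hours * 3600 + minutes * 60
    return data_tib * MIB_PER_TIB / (nodes * wall_seconds)


# Assumed: 1 TiB input; 2 nodes for the slowest run, 14 nodes for the fastest.
runs = {
    "DataWarp, 2 nodes": (2, 4, 34),
    "DataWarp, 14 nodes": (14, 0, 49),
    "Lustre, 2 nodes": (2, 3, 18),
    "Lustre, 14 nodes": (14, 0, 25),
}

for label, (nodes, hours, minutes) in runs.items():
    rate = per_node_throughput_mib_s(1, nodes, hours, minutes)
    print(f"{label}: {rate:.1f} MiB/s per node")
```

Under these assumptions the results land at roughly 25 to 32 MiB/s for DataWarp and 44 to 50 MiB/s for Lustre, in line with the rounded figures above.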
Is it I/O? Hadoop FS Counters?
[Figure: data in GiB (0 to 2500) vs. no. of worker nodes (2, 6, 10, 14). Series: Data Read from FS, Data Written to FS.]
Maybe? But looking at the counters ...
We should see at least 2 TiB of read/write every run.
We must go deeper ...
DVS & Lustre FS counters to the rescue!
[Figure: operation count (log scale, 1x10^6 to 1x10^12) and data volume in GiB (0 to 4000) vs. no. of worker nodes (2, 6, 10, 14). Series: DW Reads, DW Writes, Lustre Reads, Lustre Writes, Lustre Read Data, Lustre Write Data.]
Aha! Between 2 and 3 TiB read/write, so apparently Hadoop FS counters only count shuffle and spill.
DVS counter issues:
• Total no. of read/written bytes.
• Reported max. read/write sizes of 64 KiB vs. calculated avg. read/write sizes of 192 KiB to 2 MiB.
No. of reads/writes DW vs. Lustre?
What about Spark?
[Figure: execution time in seconds (log scale, 100 to 100000) vs. no. of worker nodes (2, 6, 10, 14). Series: SparkTeraSort Wall Time, Key Mapping Wall Time, Save Output Wall Time.]
Fails completely on two nodes.
Between 9h36m and 2h30m, 2x - 3x slower than Hadoop. (Lustre: between 2h18m and 0h25m, almost like Hadoop.)
Bummer, but at least it scales better.
Count the counters
[Figure: operation count (log scale, 1x10^6 to 1x10^12) and data volume in GiB (0 to 4000) vs. no. of worker nodes (2, 6, 10, 14). Series: DW Reads, DW Writes, Lustre Reads, Lustre Writes, Lustre Read Data, Lustre Write Data.]
2x - 3x less data read/written, 1 TiB each is the minimum.
Same number of writes.
1000x the number of reads.
That’s 100 bytes per read.
2.5x - 5x the number of opens (not shown).
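The "100 bytes per read" figure is simply bytes read divided by the number of read calls. A minimal sketch of that calculation follows. The read count of 10^10 is a hypothetical value chosen for illustration, since the slide does not give exact counter values, and the function name is not taken from any tool.

```python
TIB = 1024 ** 4


def avg_request_size(total_bytes, num_requests):
    """Average bytes per request, computed from two file system counters."""
    if num_requests <= 0:
        raise ValueError("num_requests must be positive")
    return total_bytes / num_requests


# Hypothetical counter values: 1 TiB read in 10**10 read calls.
size = avg_request_size(1 * TIB, 10 ** 10)
print(f"{size:.0f} bytes per read")
```

Reads this small are presumably where the absence of client-side caching hurts most, since each one has to be served remotely.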
Flink
[Figure: execution time in seconds (log scale, 100 to 100000) vs. no. of worker nodes (2, 6, 10, 14). Series: FlinkTeraSort Wall Time, DataSource Wall Time, Partition Wall Time, Sort Wall Time, DataSink Wall Time.]
Between 5h14m and 0h37m. (Lustre: between 5h12m and 0h11m.)
At least it’s a bit faster, half of the time.
Counting on Flink
[Figure: operation count (log scale, 1x10^6 to 1x10^12) and data volume in GiB (0 to 4000) vs. no. of worker nodes (2, 6, 10, 14). Series: DW Reads, DW Writes, Lustre Reads, Lustre Writes, Lustre Read Data, Lustre Write Data.]
Very constant I/O profile.
Why 2 TiB of data read/written? 1 TiB each should be enough, see Spark.
Almost exactly same I/O for 14 nodes as Hadoop, so operators must be more efficient.
Fast-forward two more benchmarks
... Flink wins throughput during TeraSort, Hadoop comes in 2nd, Spark is 3rd.¹
... Spark wins throughput during Streaming benchmarks¹, Flink wins latency.
... Spark wins throughput during SQL¹, Flink comes in 2nd², Hadoop is 3rd.
... DataWarp configuration always loses to corresponding Lustre configuration, always.
¹ For the configurations it does not crash on.
² Its Table API is still beta though.
Conclusions ... well, experiences.
Small site, more disk spill than necessary, however this helps our file system comparison tests.
Absolute results are bad, but the relation between frameworks and file systems is nonetheless significant:
• There are use cases for each framework, highly configuration dependent.
• Don’t use DataWarp without caching and with small transfer sizes.
CCM can be difficult to work with.
R/W memory mapped files are not supported on DataWarp.
Spark fails to run successfully a surprising number of times.
IOR with 64 KiB reads/writes roughly agrees with Hadoop FS counters.
What we don’t yet know
Why are there more reads/writes on Lustre than on DataWarp?
Why do the DVS counters report inconsistent values in one case?
Where does Flink’s I/O come from?
How do IPC Rx/Tx bytes relate to actually read/received data?
When do we get DataWarp Stage 2?