GraphX: Graph Processing in a Distributed Dataflow Framework
Joseph Gonzalez, Postdoc, UC Berkeley AMPLab; Co-founder, GraphLab Inc.
Joint work with Reynold Xin, Ankur Dave, Daniel Crankshaw, Michael Franklin, and Ion Stoica
OSDI 2014, UC Berkeley
Modern Analytics
[Figure: analytics pipeline. Raw Wikipedia (XML) is parsed into a link table and a discussion table; hyperlinks feed PageRank to produce the top 20 pages (title, PR); the editor graph feeds community detection to produce user communities and the top communities.]
Tables
Graphs
Separate Systems
[Figure: the table stages of the pipeline live in dataflow systems, where rows of a table flow through operators to a result; the graph stages live in specialized graph systems, which iterate over a dependency graph, updating vertex states (before/after).]
Difficult to Use
Users must learn, deploy, and manage multiple systems, which leads to brittle and often complex interfaces.
Inefficient
Extensive data movement and duplication across the network and file system.
[Figure: each stage of the pipeline reads its input from HDFS and writes its output back to HDFS.]
Limited reuse of internal data structures across stages.
GraphX
[Figure: GraphX layered on the Spark dataflow framework.]
GraphX Unifies Computation on Tables and Graphs
Table View Graph View
Enabling a single system to easily and efficiently support the entire pipeline
PageRank on the LiveJournal Graph
[Chart: runtime in seconds for 10 iterations of PageRank; GraphLab: 22, Spark: 354, Hadoop: 1340.]
Hadoop is 60x slower than GraphLab; Spark is 16x slower than GraphLab.
Key Question
How can we naturally express and efficiently execute graph computation in a general-purpose dataflow framework?
Approach: distill the lessons learned from specialized graph systems.
Representation Optimizations
Example Computation: PageRank
Express the computation locally and iterate until convergence:

  R[i] = 0.15 + Σ_{j ∈ InLinks(i)} R[j] / |OutLinks(j)|

The rank of page i is the random reset probability (0.15) plus a weighted sum of its neighbors' ranks.
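The update above can be sketched in plain Scala as a non-distributed toy, with the standard 0.85 damping factor applied to the neighbor sum; the graph encoding and all names are illustrative, not GraphX's API:

```scala
// Iterative PageRank over an in-memory graph.
// inLinks(i): in-neighbors of i; outDegree(j): out-degree of j.
def pageRank(inLinks: Map[Int, Seq[Int]],
             outDegree: Map[Int, Int],
             numIter: Int): Map[Int, Double] = {
  val vertices = outDegree.keys.toSeq
  var rank: Map[Int, Double] = vertices.map(v => v -> 1.0).toMap
  for (_ <- 1 to numIter) {
    rank = vertices.map { i =>
      // Weighted sum of in-neighbors' ranks.
      val sum = inLinks.getOrElse(i, Seq.empty)
        .map(j => rank(j) / outDegree(j)).sum
      i -> (0.15 + 0.85 * sum)
    }.toMap
  }
  rank
}
```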
“Think like a Vertex.” - Malewicz et al., SIGMOD’10
Graph-Parallel Pattern (Gonzalez et al. [OSDI'12]):
- Gather information from neighboring vertices
- Apply an update to the vertex property
- Scatter information to neighboring vertices
Many Graph-Parallel Algorithms
Machine learning:
- Collaborative filtering: alternating least squares, stochastic gradient descent, tensor factorization
- Structured prediction: loopy belief propagation, max-product linear programs, Gibbs sampling
- Semi-supervised ML: graph SSL, CoEM
Network analysis:
- Community detection: triangle counting, k-core decomposition, k-truss
- Graph analytics: PageRank, personalized PageRank, shortest path, graph coloring
The specialized computational pattern enables specialized graph optimizations.

Graph System Optimizations:
- Specialized data structures
- Vertex-cut partitioning
- Remote caching / mirroring
- Message combiners
- Active set tracking
Mapping graph systems onto a dataflow framework:
- Representation: distributed graphs become horizontally partitioned tables plus joins
- Computation: vertex programs become dataflow operators
- Optimizations: advances in graph processing systems become distributed join optimization and materialized view maintenance
Property Graph Data Model
[Figure: example property graph on vertices A, B, C, D, E, F.]
Vertex property: user profile, current PageRank value.
Edge property: weights, timestamps.
Encoding Property Graphs as Tables
[Figure: the property graph is split across two machines by a vertex cut. The Edge Table (RDD) holds the edges by partition (partition 1: A→B, A→C, C→D, B→C; partition 2: A→E, A→F, E→F, E→D). The Vertex Table (RDD) holds the properties of vertices A through F. The Routing Table (RDD) records, for each vertex, which edge partitions reference it (e.g. A → {1, 2}, D → {1, 2}, B → {1}, F → {2}).]
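The routing table can be sketched from a vertex-cut edge partitioning in plain Scala; the Edge class, Long vertex ids, and Int partition ids here are illustrative stand-ins, not GraphX internals:

```scala
// An edge between two vertex ids.
case class Edge(src: Long, dst: Long)

// For each vertex, record the set of edge partitions that reference it.
def routingTable(partitions: Map[Int, Seq[Edge]]): Map[Long, Set[Int]] =
  partitions.toSeq
    .flatMap { case (pid, edges) =>
      // Every edge references its source and destination vertex.
      edges.flatMap(e => Seq(e.src -> pid, e.dst -> pid))
    }
    .groupBy(_._1)
    .map { case (v, pairs) => v -> pairs.map(_._2).toSet }
```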
Separating properties from structure lets GraphX reuse structural information across multiple graphs.
[Figure: transforming vertex properties produces a new vertex table, while the routing table and edge table are reused unchanged between the input graph and the transformed graph.]
Table Operators
Table operators are inherited from Spark: map, filter, groupBy, sort, union, join, leftOuterJoin, rightOuterJoin, reduce, count, fold, reduceByKey, groupByKey, cogroup, cross, zip, sample, take, first, partitionBy, mapWith, pipe, save, ...
Graph Operators (Scala)

class Graph[V, E] {
  def Graph(vertices: Table[(Id, V)], edges: Table[(Id, Id, E)])

  // Table Views -----------------
  def vertices: Table[(Id, V)]
  def edges: Table[(Id, Id, E)]
  def triplets: Table[((Id, V), (Id, V), E)]

  // Transformations ------------------------------
  def reverse: Graph[V, E]
  def subgraph(pV: (Id, V) => Boolean,
               pE: Edge[V, E] => Boolean): Graph[V, E]
  def mapV(m: (Id, V) => T): Graph[T, E]
  def mapE(m: Edge[V, E] => T): Graph[V, T]

  // Joins ----------------------------------------
  def joinV(tbl: Table[(Id, T)]): Graph[(V, T), E]
  def joinE(tbl: Table[(Id, Id, T)]): Graph[V, (E, T)]

  // Computation ----------------------------------
  def mrTriplets(mapF: (Edge[V, E]) => List[(Id, T)],
                 reduceF: (T, T) => T): Graph[T, E]
}
The mrTriplets operator captures the gather-scatter pattern from specialized graph-processing systems:

  def mrTriplets(mapF: (Edge[V, E]) => List[(Id, T)],
                 reduceF: (T, T) => T): Graph[T, E]
Triplets Join Vertices and Edges
The triplets operator joins vertices and edges:
[Figure: vertices A, B, C, D and edges A→B, A→C, B→C, C→D yield the triplets (A, B), (A, C), (B, C), (C, D), each carrying both endpoint properties alongside the edge property.]
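As a sketch, the triplets view amounts to joining each edge with the properties of its source and destination vertices; this in-memory stand-in is illustrative, not the actual GraphX implementation:

```scala
// Join each edge (src, dst, attr) with its endpoint vertex properties.
def triplets[V, E](vertices: Map[Long, V],
                   edges: Seq[(Long, Long, E)]): Seq[((Long, V), (Long, V), E)] =
  edges.map { case (src, dst, attr) =>
    ((src, vertices(src)), (dst, vertices(dst)), attr)
  }
```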
Map-Reduce Triplets
mrTriplets collects information about the neighborhood of each vertex:
[Figure: a map function runs on each triplet (A→B, A→C, B→C, C→D) and emits a message keyed by the source or destination vertex; messages targeting the same vertex (e.g. C) are combined by the reduce function, which acts as a message combiner.]
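A minimal in-memory model of mrTriplets makes the map-then-combine structure concrete; the Triplet class here is an illustrative stand-in, and the real operator runs as distributed joins and aggregations:

```scala
// A triplet: an edge together with both endpoint properties.
case class Triplet[V, E](srcId: Long, src: V, dstId: Long, dst: V, attr: E)

// Map over every triplet, then reduce messages per destination vertex id.
def mrTriplets[V, E, T](trips: Seq[Triplet[V, E]],
                        mapF: Triplet[V, E] => List[(Long, T)],
                        reduceF: (T, T) => T): Map[Long, T] =
  trips.flatMap(mapF)
    .groupBy(_._1)
    .map { case (id, msgs) => id -> msgs.map(_._2).reduce(reduceF) }
```

For example, emitting `(t.dstId, 1)` for every triplet and reducing with `+` computes in-degrees.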
Using these basic GraphX operators we implemented Pregel and GraphLab in under 50 lines of code!
The GraphX Stack (Lines of Code)
- Spark (30,000)
- GraphX (2,500)
- Pregel API (34)
- PageRank (20), Connected Components (20), K-core (60), Triangle Count (50), LDA (220), SVD++ (110)

Some algorithms are more naturally expressed using the GraphX primitive operators.
Recall the mapping onto a dataflow framework:
- Representation: distributed graphs become horizontally partitioned tables plus joins
- Computation: vertex programs become dataflow operators
- Optimizations: advances in graph processing systems become distributed join optimization and materialized view maintenance
Join Site Selection using Routing Tables
[Figure: using the routing table, each vertex's property is shipped only to the edge partitions that reference that vertex; the edge table (RDD) stays in place.]
Never shuffle edges!
Caching for Iterative mrTriplets
[Figure: each edge partition keeps a mirror cache of the remote vertex properties it needs (e.g. both partitions cache A). The vertex table (RDD) is scanned and properties are shipped into the mirror caches, which are backed by a reusable hash index.]
Incremental Updates for the Triplets View
[Figure: when only vertices A and E change, only their properties are re-shipped to the mirror caches; unchanged vertex properties are not resent.]
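The idea can be sketched as a diff against the mirror cache: ship only the vertex properties that actually changed. This is a toy model with illustrative names, not GraphX's internal code:

```scala
// Given the cache's current contents and the updated vertex table,
// return the refreshed cache and the set of ids that had to be shipped.
def incrementalShip[V](cache: Map[Long, V],
                       updated: Map[Long, V]): (Map[Long, V], Set[Long]) = {
  val changed = updated.filter { case (id, v) => !cache.get(id).contains(v) }
  (cache ++ changed, changed.keySet)
}
```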
Aggregation for Iterative mrTriplets
[Figure: messages generated within each edge partition are locally aggregated per destination vertex before being sent to the vertex table, so each partition sends at most one combined message per vertex.]
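This two-level combine can be sketched over in-memory collections, standing in for the distributed shuffle; partition boundaries and names are illustrative:

```scala
// Aggregate messages within each partition, then combine across partitions.
def twoLevelAggregate[T](partitions: Seq[Seq[(Long, T)]],
                         reduceF: (T, T) => T): Map[Long, T] = {
  // Local combine: one message per destination id per partition.
  val local = partitions.map(_.groupBy(_._1)
    .map { case (id, ms) => id -> ms.map(_._2).reduce(reduceF) })
  // Global combine over the (much smaller) locally aggregated messages.
  local.flatten.groupBy(_._1)
    .map { case (id, ms) => id -> ms.map(_._2).reduce(reduceF) }
}
```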
Reduction in Communication Due to Cached Updates
[Chart: network communication (MB, log scale from 0.1 to 10,000) per iteration (0 to 16) for connected components on the Twitter graph; communication falls off sharply in later iterations.]
Most vertices are within 8 hops of all vertices in their component.
Benefit of Indexing Active Vertices
[Chart: per-iteration runtime (0 to 30 seconds) over iterations 0 to 16 for connected components on the Twitter graph; with active vertex tracking, later iterations run much faster than without it.]
Join Elimination
Identify and bypass joins for unused triplet fields, via Java bytecode inspection.
[Chart: communication (MB) per iteration for PageRank on Twitter, with and without join elimination; lower is better.]
Factor of 2 reduction in communication.
Additional Optimizations
Indexing and bitmaps:
- To accelerate joins across graphs
- To efficiently construct sub-graphs
Lineage-based fault tolerance:
- Exploits Spark lineage to recover in parallel
- Eliminates the need for costly checkpoints
Substantial index and data reuse:
- Reuse routing tables across graphs and sub-graphs
- Reuse edge adjacency information and indices
System Comparison
Goal: demonstrate that GraphX achieves performance parity with specialized graph-processing systems.
Setup: 16-node EC2 cluster (m2.4xlarge, 8 cores each) + 1GigE. Compared against GraphLab/PowerGraph (C++), Giraph (Java), and Spark (Java/Scala).
PageRank Benchmark
[Chart: runtime (seconds) on the Twitter graph (42M vertices, 1.5B edges) and the UK graph (106M vertices, 3.7B edges); EC2 cluster of 16 m2.4xlarge nodes (8 cores each) + 1GigE; slide annotations: 7x, 18x.]
GraphX performs comparably to state-of-the-art graph processing systems.
Connected Components Benchmark
[Chart: runtime (seconds) on the Twitter graph (42M vertices, 1.5B edges) and the UK graph (106M vertices, 3.7B edges); EC2 cluster of 16 m2.4xlarge nodes (8 cores each) + 1GigE; one system runs out of memory on the UK graph; slide annotations: 8x, 10x.]
GraphX performs comparably to state-of-the-art graph processing systems.
Graphs are just one stage... What about a pipeline?
[Figure: Spark preprocess → graph compute → Spark postprocess, with data passing through HDFS between stages.]
A Small Pipeline in GraphX
[Chart: total end-to-end runtime (seconds) for the Raw Wikipedia (XML) → hyperlinks → PageRank → top-20-pages pipeline: Spark 1492, Giraph + Spark 605, GraphLab + Spark 375, GraphX 342.]
Timed end-to-end, GraphX is the fastest.
Adoption and Impact
- GraphX is now part of Apache Spark, and part of the Cloudera Hadoop distribution
- In production at Alibaba Taobao, with order-of-magnitude gains over Spark
- Inspired GraphLab Inc.'s SFrame technology, which unifies tables and graphs on disk
GraphX → Unified Tables and Graphs
Enabling users to easily and efficiently express the entire analytics pipeline.
New API: blurs the distinction between tables and graphs.
New System: unifies data-parallel and graph-parallel systems.
What did we Learn?
[Figure: specialized systems (graph systems, parameter server) alongside integrated frameworks (GraphX).]
Future Work
Bringing other specialized systems, such as the parameter server, into integrated frameworks: asynchrony, non-determinism, and shared state.
Thank You
http://amplab.cs.berkeley.edu/projects/graphx/
Reynold Xin
Ankur Dave
Daniel Crankshaw
Michael Franklin
Ion Stoica
Related Work
Specialized graph-processing systems: GraphLab [UAI'10], Pregel [SIGMOD'10], Signal-Collect [ISWC'10], Combinatorial BLAS [IJHPCA'11], GraphChi [OSDI'12], PowerGraph [OSDI'12], Ligra [PPoPP'13], X-Stream [SOSP'13]
Alternatives to dataflow frameworks: Naiad [SOSP'13] (GraphLINQ); Hyracks (Pregelix [VLDB'15])
Distributed join optimization: Multicast Join [Afrati et al., EDBT'10]; Semi-Join in MapReduce [Blanas et al., SIGMOD'10]
Edge Files Have Locality
[Chart: runtime (seconds) on the UK graph (106M vertices, 3.7B edges) for GraphLab, GraphX + shuffle, and GraphX.]
GraphLab rebalances the edge files on load. GraphX preserves the on-disk layout through Spark → a better vertex cut.
Scalability
[Chart: runtime (seconds) vs. number of EC2 nodes (8 to 64) for the Twitter graph (42M vertices, 1.5B edges), with a linear-scaling reference.]
Scales slightly better than PowerGraph/GraphLab.
Apache Spark Dataflow Platform
Resilient Distributed Datasets (RDDs): Zaharia et al., NSDI'12
[Figure: dataflow HDFS → Load → RDD → Map → RDD → Reduce → RDD → HDFS.]
Optimized for iterative access to data: .cache() persists an RDD in memory.
PageRank Benchmark
[Chart: runtime (seconds) on the Twitter graph (42M vertices, 1.5B edges) and the UK graph (106M vertices, 3.7B edges); EC2 cluster of 16 m2.4xlarge nodes + 1GigE.]
GraphX performs comparably to state-of-the-art graph processing systems.
Shared Memory Advantage
[Figure: Spark's shared-nothing model exchanges data between cores through shuffle files, while GraphLab's cores share a de-serialized in-memory graph.]
[Chart: PageRank runtime (seconds) on the Twitter graph (42M vertices, 1.5B edges) for GraphLab, GraphLab without shared memory (cores communicating over TCP/IP), and GraphX.]
PageRank Benchmark
[Chart: runtime (seconds) on the Twitter graph (42M vertices, 1.5B edges) and the UK graph (106M vertices, 3.7B edges).]
GraphX performs comparably to state-of-the-art graph processing systems.

Connected Components Benchmark
[Chart: runtime (seconds) on the Twitter and UK graphs; EC2 cluster of 16 m2.4xlarge nodes + 1GigE; one system runs out of memory on the UK graph; slide annotations: 8x, 10x.]
GraphX performs comparably to state-of-the-art graph processing systems.
Fault Tolerance
Leverages Spark's fault-tolerance mechanism.
[Chart: runtime (seconds) comparing no failure, lineage-based recovery, and restart.]
Graph-Processing Systems
Expose a specialized API to simplify graph programming: Pregel (Google), GraphLab, CombBLAS, Kineograph, X-Stream, GraphChi, Ligra, GPS.

The Vertex-Program Abstraction

Pregel_PageRank(i, messages):
  // Receive all the messages
  total = 0
  foreach (msg in messages):
    total = total + msg

  // Update the rank of this vertex
  R[i] = 0.15 + total

  // Send new messages to neighbors
  foreach (j in out_neighbors[i]):
    Send msg(R[i]) to vertex j
The Vertex-Program Abstraction (Low, Gonzalez, et al. [UAI'10])

GraphLab_PageRank(i):
  // Compute sum over neighbors
  total = 0
  foreach (j in neighbors(i)):
    total += R[j] * wji

  // Update the PageRank
  R[i] = 0.15 + total

[Figure: vertex 1 gathers the weighted ranks of its neighbors 2, 3, 4, e.g. R[4] * w41.]
Example: Oldest Follower
Calculate the number of older followers for each user.
[Figure: follower graph on vertices A through F with per-vertex ages.]

val olderFollowerAge = graph
  .mrTriplets(
    e => // Map
      if (e.src.age > e.dst.age) List((e.srcId, 1)) else List.empty,
    (a, b) => a + b // Reduce
  )
  .vertices
Enhanced Pregel in GraphX

pregelPR(i, messageList):
  // Receive all the messages
  total = 0
  foreach (msg in messageList):
    total = total + msg

  // Update the rank of this vertex
  R[i] = 0.15 + total

  // Send new messages to neighbors
  foreach (j in out_neighbors[i]):
    Send msg(R[i]/E[i,j]) to vertex j

Enhancement 1: require message combiners, replacing messageList with a pre-combined messageSum.
Enhancement 2: remove message computation from the vertex program:

sendMsg(i→j, R[i], R[j], E[i,j]):
  // Compute single message
  return msg(R[i]/E[i,j])

combineMsg(a, b):
  // Compute sum of two messages
  return a + b
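A minimal synchronous Pregel loop in plain Scala can make the sendMsg/combineMsg split concrete. This is a single-machine toy with illustrative names, not GraphX's API, and sendMsg is simplified to compute messages from the source vertex only:

```scala
// A toy synchronous Pregel: apply the vertex program, then iterate
// send -> combine -> apply for a fixed number of supersteps.
def pregel[V, T](vertices: Map[Long, V],
                 edges: Seq[(Long, Long)],
                 initialMsg: T,
                 numIter: Int)(
    vprog: (V, T) => V,       // vertex program: combine state with message sum
    sendMsg: V => T,          // message computed from the source vertex
    combine: (T, T) => T): Map[Long, V] = {
  // Apply the initial message to every vertex first.
  var state = vertices.map { case (id, v) => id -> vprog(v, initialMsg) }
  for (_ <- 1 to numIter) {
    // One message per edge, combined per destination vertex.
    val msgs = edges
      .map { case (src, dst) => dst -> sendMsg(state(src)) }
      .groupBy(_._1)
      .map { case (id, ms) => id -> ms.map(_._2).reduce(combine) }
    // Update only vertices that received a message.
    state = state.map { case (id, v) =>
      id -> msgs.get(id).map(m => vprog(v, m)).getOrElse(v)
    }
  }
  state
}
```

On a two-vertex cycle with unit out-degrees, instantiating vprog as `0.15 + 0.85 * msgSum` runs the damped PageRank from the slides.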
PageRank in GraphX

// Load and initialize the graph
val graph = GraphBuilder.text("hdfs://web.txt")
val prGraph = graph.joinVertices(graph.outDegrees)

// Implement and run PageRank
val pageRank =
  prGraph.pregel(initialMessage = 0.0, iter = 10)(
    (oldV, msgSum) => 0.15 + 0.85 * msgSum,
    triplet => triplet.src.pr / triplet.src.deg,
    (msgA, msgB) => msgA + msgB)
Example Analytics Pipeline

// Load raw data tables
val articles = sc.textFile("hdfs://wiki.xml").map(xmlParser)
val links = articles.flatMap(article => article.outLinks)

// Build the graph from tables
val graph = new Graph(articles, links)

// Run the PageRank algorithm
val pr = graph.PageRank(tol = 1.0e-5)

// Extract and print the top 20 articles
val topArticles = articles.join(pr).top(20).collect
for ((article, pageRank) <- topArticles) {
  println(article.title + '\t' + pageRank)
}
Apache Spark Dataflow Platform
Resilient Distributed Datasets (RDDs): Zaharia et al., NSDI'12
[Figure: dataflow HDFS → Load → RDD → Map → RDD → Reduce → RDD, with .cache() persisting an RDD in memory.]
Lineage: each RDD records the chain of operations (Load → Map → Reduce) that produced it.