Chronos: A Graph Engine for Temporal Graph Analysis

Wentao Han (1,3), Youshan Miao (2,3), Kaiwei Li (1,3), Ming Wu (3), Fan Yang (3), Lidong Zhou (3), Vijayan Prabhakaran (3), Wenguang Chen (1), Enhong Chen (2)
(1) Tsinghua University, (2) University of Science and Technology of China, (3) Microsoft Research


Page 1: Chronos: A Graph Engine for Temporal Graph Analysis

Chronos: A Graph Engine for Temporal Graph Analysis

Wentao Han (1,3), Youshan Miao (2,3), Kaiwei Li (1,3), Ming Wu (3), Fan Yang (3), Lidong Zhou (3), Vijayan Prabhakaran (3), Wenguang Chen (1), Enhong Chen (2)

(1) Tsinghua University
(2) University of Science and Technology of China
(3) Microsoft Research

Page 2: Chronos: A Graph Engine for Temporal Graph Analysis

Temporal Graphs

• Real-world graphs evolve: temporal graphs
• Temporal graph properties bring more insights

[Figure: a social graph shown at three points in time (2012, 2013, 2014), next to a plot of user ranking by year.]

Page 3: Chronos: A Graph Engine for Temporal Graph Analysis

Temporal Graphs (continued)

• Real-world graphs evolve: temporal graphs
• Temporal graph properties bring more insights
• Ranks tracked over time can tell the users apart

[Figure: same social graph (2012, 2013, 2014) and user-ranking-by-year plot as on the previous slide.]

Page 4: Chronos: A Graph Engine for Temporal Graph Analysis

Temporal Graph Analysis

Computing properties on a series of graph snapshots

[Figure: graph snapshots at t0, t1, and t2 each go through static graph analysis to produce graph properties, shown as the user-ranking-by-year plot (2012, 2013, 2014).]

Page 5: Chronos: A Graph Engine for Temporal Graph Analysis

Temporal Graph Analysis

• Existing graph engines: targeting static graph analysis
• A possible solution: computing snapshot by snapshot

[Figure: the snapshots for 2012, 2013, and 2014 are processed as separate tasks: Task 1, Task 2, Task 3.]

Page 6: Chronos: A Graph Engine for Temporal Graph Analysis


Performance Issues

Page 7: Chronos: A Graph Engine for Temporal Graph Analysis

Revisit: Static Graph Analysis

Propagation-based graph computation model

• Local computation
• Data propagation

[Figure: a scan over the edge array (v1 → v2, v1 → v3, ..., v3 → v5, ...) drives updates to the vertex data array (v1, v2, ..., v3, ...) for an example graph with vertices v1, v2, v3, and v5.]
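The propagation-based model on this slide can be sketched as one scan over an edge array that updates a vertex data array. This is a minimal illustration, not the engine's code: the function name `propagate_once`, the PageRank-style update, and the damping value are assumptions made for the example.

```python
from typing import List, Tuple


def propagate_once(rank: List[float],
                   edges: List[Tuple[int, int]],
                   damping: float = 0.85) -> List[float]:
    """One PageRank-style iteration driven by a scan of the edge array."""
    n = len(rank)
    out_degree = [0] * n
    for src, _ in edges:
        out_degree[src] += 1

    incoming = [0.0] * n
    # Data propagation: with edges sorted by source, rank[src] is read
    # sequentially, but incoming[dst] is written at an arbitrary position.
    for src, dst in edges:
        incoming[dst] += rank[src] / out_degree[src]

    # Local computation: each vertex combines what it received.
    return [(1.0 - damping) / n + damping * incoming[v] for v in range(n)]


# Edge array following the slide's v1 → v2, v1 → v3, v3 → v5,
# with vertex ids shifted to 0-based indices (an assumption of this sketch).
edges = [(0, 1), (0, 2), (2, 4)]
rank = propagate_once([0.2] * 5, edges)
```

The scattered writes to `incoming[dst]` are the accesses that the next slide marks as cache misses.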

Page 8: Chronos: A Graph Engine for Temporal Graph Analysis

Revisit: Static Graph Analysis (continued)

Propagation-based graph computation model

• Local computation
• Data propagation

[Figure: same vertex data array, edge array, and example graph as on the previous slide, with an access to the vertex data array marked as a cache miss.]

Page 9: Chronos: A Graph Engine for Temporal Graph Analysis

Revisit: Static Graph Analysis

In parallel: partition graph & computations among CPU cores

[Figure: the vertex data array and the edge array are each split between Core 0 and Core 1, and the example graph (v1, v2, v3, v5) is partitioned accordingly. A cross-partition edge causes inter-core communication.]

Page 10: Chronos: A Graph Engine for Temporal Graph Analysis

Temporal Graph Analysis: Snapshot by Snapshot

Computation on multiple graph snapshots: the cost is multiplied

N snapshots
⇒ N cache misses
⇒ N inter-core comm.

Vertex data arrays:
  Snapshot 1: v1 v2 ... v3 ...
  Snapshot 2: v1' v2' ... v3' ...
  Snapshot 3: v1'' v2'' ... v3'' ...

Page 11: Chronos: A Graph Engine for Temporal Graph Analysis

Observations

Real-world graphs often evolve gradually (similar snapshots)

[Figure: Snapshot 1, Snapshot 2, and Snapshot 3 of an example graph with vertices v1 to v5; primes (', '') mark the vertex data in Snapshots 2 and 3.]

Page 12: Chronos: A Graph Engine for Temporal Graph Analysis

Observations

Similar propagations across snapshots

[Figure: Snapshot 1, Snapshot 2, and Snapshot 3 of the example graph with vertices v1 to v5; primes (', '') mark the vertex data in Snapshots 2 and 3.]

Page 13: Chronos: A Graph Engine for Temporal Graph Analysis

Idea

Group propagations by source & target, not by snapshot

[Figure: the propagations 1 → 2, 1 → 3, 1 → 4, and 1 → 5 shown on Snapshot 1, Snapshot 2, and Snapshot 3 of the example graph (v1 to v5). Two schedules are labelled: one with Step 1 to Step 3 and one with Step 1 to Step 4.]
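The two groupings named on this slide can be compared on a toy set of propagations. The sketch below is illustrative only: it assumes the four edges 1 → 2, 1 → 3, 1 → 4, 1 → 5 exist in all three snapshots, and the function names are made up for the example.

```python
from itertools import groupby
from typing import List, Tuple

# A propagation is (snapshot, source, target).
Propagation = Tuple[int, int, int]

# Assumed for illustration: the same four edges in each of three snapshots.
propagations: List[Propagation] = [
    (s, 1, t) for s in (1, 2, 3) for t in (2, 3, 4, 5)
]


def by_snapshot(props: List[Propagation]) -> List[List[Propagation]]:
    """One group per snapshot: the snapshot-by-snapshot order."""
    key = lambda p: p[0]
    return [list(g) for _, g in groupby(sorted(props, key=key), key=key)]


def by_source_target(props: List[Propagation]) -> List[List[Propagation]]:
    """One group per (source, target) pair, spanning all snapshots."""
    key = lambda p: (p[1], p[2])
    return [list(g) for _, g in groupby(sorted(props, key=key), key=key)]


steps_a = by_snapshot(propagations)       # 3 groups of 4 propagations
steps_b = by_source_target(propagations)  # 4 groups of 3 propagations
```

Every group in `steps_b` touches the same source and target vertex in each snapshot, which is the property the data layout on the following slide exploits.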

Page 14: Chronos: A Graph Engine for Temporal Graph Analysis

Chronos: Data Layout

• Place together data for the same vertex across multiple snapshots (fit in a cache line)

Vertex data arrays (snapshot-by-snapshot):
  Snapshot 1: v1 v2 ... v3 ...
  Snapshot 2: v1' v2' ... v3' ...
  Snapshot 3: v1'' v2'' ... v3'' ...

Vertex data array (Chronos, with time-locality), Snapshots 1, 2, 3:
  v1 v1' v1'' v2 v2' v2'' ... v3 v3' v3'' ...
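The difference between the two layouts comes down to how a (vertex, snapshot) pair maps to an array index. The following sketch counts the cache lines touched when reading one vertex in every snapshot. The 8-byte value size, the 64-byte cache line, the line-aligned array start, and all names are assumptions for illustration, not figures from the slides.

```python
from typing import Iterable


def snapshot_major_index(vertex: int, snapshot: int, num_vertices: int) -> int:
    """Snapshot-by-snapshot layout: one full vertex array per snapshot."""
    return snapshot * num_vertices + vertex


def vertex_major_index(vertex: int, snapshot: int, num_snapshots: int) -> int:
    """Time-locality layout: all snapshots of a vertex are adjacent."""
    return vertex * num_snapshots + snapshot


def cache_lines_touched(indices: Iterable[int],
                        value_bytes: int = 8,
                        line_bytes: int = 64) -> int:
    """Number of distinct cache lines covered by the given array indices."""
    return len({(i * value_bytes) // line_bytes for i in indices})


num_vertices, num_snapshots = 1_000, 8
v = 42
old = [snapshot_major_index(v, s, num_vertices) for s in range(num_snapshots)]
new = [vertex_major_index(v, s, num_snapshots) for s in range(num_snapshots)]

lines_old = cache_lines_touched(old)  # one line per snapshot
lines_new = cache_lines_touched(new)  # a single line under these assumptions
```

With eight 8-byte values per vertex, one vertex's data exactly fills a 64-byte line. Other batch sizes or value sizes may straddle two lines.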

Page 15: Chronos: A Graph Engine for Temporal Graph Analysis

Chronos: Propagation Scheduling

• Locality-Aware Batch Scheduling (LABS):
  • Batching propagations across snapshots

Vertex data array: v1 v1' v1'' v2 v2' v2'' ... v3 v3' v3'' ... (fit in a cache line)

Edge array (scanned in order):
  v1 → v2, v1' → v2', v1'' → v2''   (vertex 1 -> vertex 2 across snapshots)
  v1 → v3, v1' → v3', v1'' → v3''   (vertex 1 -> vertex 3 across snapshots)
  ...

Page 16: Chronos: A Graph Engine for Temporal Graph Analysis

Chronos: Propagation Scheduling (continued)

• Locality-Aware Batch Scheduling (LABS):
  • Batching propagations across snapshots

[Figure: same vertex data array and edge array as on the previous slide. The batched propagations v1 → v2, v1' → v2', v1'' → v2'' access data that fits in a cache line, so accesses after the first are marked as cache hits.]

N propagations ⇒ 1 cache miss

Page 17: Chronos: A Graph Engine for Temporal Graph Analysis

Chronos: Propagation Scheduling (continued)

• Locality-Aware Batch Scheduling (LABS):
  • Batching propagations across snapshots

[Figure: the vertex data array and the edge array are partitioned between Core 0 and Core 1. Propagations for the same edge across snapshots are accessed in a batch, so they share one inter-core communication.]

N propagations ⇒ 1 inter-core comm.

Page 18: Chronos: A Graph Engine for Temporal Graph Analysis

LABS: The Key of Chronos

• A graph layout
  • Place together vertex/edge data across snapshots
• A scheduling mechanism
  • Batch propagations across snapshots
• Efficient
  • Reduced cache misses / inter-core comm.
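The combination of layout and scheduling summarized above can be sketched as a single edge scan that serves every snapshot of a batch. This is a hedged illustration rather than the Chronos implementation: the names, the plain sum used as the propagation, and the convention that bit `s` of an edge's integer marks presence in snapshot `s` are all assumptions.

```python
from typing import List, Tuple

# (source, target, presence bits); bit s set means the edge exists in snapshot s.
TemporalEdge = Tuple[int, int, int]


def labs_propagate(values: List[List[float]],
                   edges: List[TemporalEdge]) -> List[List[float]]:
    """Batched order: one edge scan, all snapshots handled per edge."""
    num_snapshots = len(values[0])
    received = [[0.0] * num_snapshots for _ in values]
    for src, dst, bits in edges:
        # One lookup of each endpoint's row serves the whole batch.
        src_row, dst_row = values[src], received[dst]
        for s in range(num_snapshots):
            if (bits >> s) & 1:
                dst_row[s] += src_row[s]
    return received


def snapshot_by_snapshot(values: List[List[float]],
                         edges: List[TemporalEdge]) -> List[List[float]]:
    """Baseline order: one full edge scan per snapshot."""
    num_snapshots = len(values[0])
    received = [[0.0] * num_snapshots for _ in values]
    for s in range(num_snapshots):
        for src, dst, bits in edges:
            if (bits >> s) & 1:
                received[dst][s] += values[src][s]
    return received


# values[v][s] is the data of vertex v in snapshot s (vertex-major layout).
values = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
edges = [(0, 1, 0b011), (0, 2, 0b111)]
batched = labs_propagate(values, edges)
baseline = snapshot_by_snapshot(values, edges)
```

Both functions compute the same result. Only the loop order differs, and the loop order is what decides how often a vertex's data has to be fetched.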

Page 19: Chronos: A Graph Engine for Temporal Graph Analysis

Experimental Evaluation

• Large temporal graphs

  Graph     # of Vertices   # of Edge Events   Time Span   Source
  Wiki      1.9 M           40.0 M             6 years     Wikipedia graph from KONECT
  Twitter   7.5 M           61.6 M             3 months    Provided by Twitter
  Weibo     27.7 M          4.9 B              3 years     Crawled from Sina Weibo
  Web       133.6 M         7.2 B              12 months   Web graph from DELIS

• Various graph algorithms
  • PageRank
  • Weakly-connected components (WCC)
  • Single-source shortest path (SSSP)
  • Maximal independent set (MIS)
  • Sparse matrix-vector multiplication (SpMV)

• Settings

  CPU    2.4GHz 16-Core
  RAM    128GB
  DISK   1TB SSD

Page 20: Chronos: A Graph Engine for Temporal Graph Analysis

Chronos: Single-Thread Effectiveness

5~9x speedup

[Figure: "Temporal Graph Analysis on Wiki". Speedup (y-axis, 1 to 10) versus batch size (x-axis, 0 to 32) for WCC, PageRank, and SSSP. Baseline (speedup of 1): snapshot by snapshot.]

Page 21: Chronos: A Graph Engine for Temporal Graph Analysis

Chronos: Single-Thread Effectiveness

Reduced cache misses

Cache Miss Reduction (cache miss #, in millions):

                   BatchSize=1   BatchSize=4   BatchSize=16   BatchSize=32   Reduction
  L1d Cache Miss         8,759         3,865          1,107            687         92%
  LLC Cache Miss         3,462         1,003            287            160         95%
  dTLB Miss                649           584            265            196         70%

Page 22: Chronos: A Graph Engine for Temporal Graph Analysis

Chronos: Multi-Core Performance

More than 10x faster

[Figure: "PageRank on Wiki". Speedup (y-axis, 0 to 90) versus # of cores (x-axis, 0 to 16) for snapshot-by-snapshot and Chronos, with the baseline at 1 and a 10x gap annotated.]

Page 23: Chronos: A Graph Engine for Temporal Graph Analysis

Chronos: Multi-Core Performance

Reduced inter-core comm.

Reduced Inter-Core Communications (communication num., in millions):

  Number of Cores   No LABS   LABS     Reduction
  2                 977.64    23.08    98%
  4                 2471.6    58.56    98%
  8                 4244.2    105.2    98%

Page 24: Chronos: A Graph Engine for Temporal Graph Analysis

More in Paper:

• Graph computation modes: push mode, pull mode, stream mode
• All benefit from LABS

[Figure: small example graphs (vertices v1 to v8) illustrating push mode, pull mode, and stream mode.]

Page 25: Chronos: A Graph Engine for Temporal Graph Analysis


More in Paper:

• Incremental graph computation

• Leveraging the previous snapshot’s result

• Computing only the changed part

• Can be enhanced with LABS

Page 26: Chronos: A Graph Engine for Temporal Graph Analysis

Conclusion

• Temporal graph analysis
  • An emerging class of applications
• Chronos
  • Supports analysis of temporal graphs efficiently
• Joint design of data layout and scheduling
  • Leveraging the temporal similarity of graphs
  • Exploiting data locality, esp. in the time dimension

Page 27: Chronos: A Graph Engine for Temporal Graph Analysis

Thank You!

Questions?

Tsinghua University
University of Science and Technology of China
Microsoft Research

Page 28: Chronos: A Graph Engine for Temporal Graph Analysis

BACKUP

• Experiment Environment Details
• Real Graphs Similarities over Time
• Batch Size Discussion
• LABS Locking
• LABS with Incremental Computation
• LABS on Cluster
• Related Work

Page 29: Chronos: A Graph Engine for Temporal Graph Analysis

Experiment Setup

  CPU           2.4GHz Intel Xeon E5-2665 16-core
  RAM           128GB
  DISK          1TB SSD (RAID 0 with 372GB [1] * 3)
  Network       InfiniBand (DDR, 40Gb/s)
  ClusterSize   4

[1] SSD model: TOSHIBA MK4001GRZB

Page 30: Chronos: A Graph Engine for Temporal Graph Analysis

Temporal Distributions of Graphs

• Edges increase gradually

[Figure: two plots, "wiki" and "weibo". Each shows the number of edges (y-axis, 0% to 100%) against the ratio of time range (x-axis, 6% to 100%).]

Page 31: Chronos: A Graph Engine for Temporal Graph Analysis

On-disk Temporal Graph

• Ci: checkpoint of vi: edges without time information
• ai,j: j-th activity of vi: edge changes, e.g., <addE, (v0, v3, w), t2>

[Figure: the on-disk graph is a sequence of snapshot groups (Snapshot Group 0, Snapshot Group 1, ...) located through a time index. Inside a snapshot group, a vertex index points to the edge data of each vertex: for v0, the checkpoint C0 followed by the edge activities a0,1 ... a0,t; for v1, the checkpoint C1 followed by a1,1 ... a1,t.]

Page 32: Chronos: A Graph Engine for Temporal Graph Analysis

LABS: In-memory Design

Edge array (located through a vertex index; edges of v1 shown):
  (v1) → v2   110
  (v1) → v3   111
  ...

Each temporal edge carries bits that indicate which snapshots the edge exists in.

Vertex data array (located through a vertex index):
  v1 v1' v1'' (data of v1), v2 v2' v2'' (data of v2), ...

A temporal edge is logically equal to one edge per snapshot, e.g.: v1 → v2, v1' → v2', v1'' → v2''
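Expanding a temporal edge into its per-snapshot edges amounts to iterating over the set bits of its bit set. The sketch below does this with the usual lowest-set-bit trick, which plays the role of a bit-scan-forward instruction. The function names and the `@snapshot` labels are hypothetical, and the bit string is read with the leftmost digit as the first snapshot, which is how the reconstruction example on a later slide ("001" for an edge that exists only at the last time point) appears to use it.

```python
from typing import Iterator, List, Tuple


def set_bits(bits: int) -> Iterator[int]:
    """Yield the positions of the 1-bits of `bits`, lowest position first."""
    while bits:
        lowest = bits & -bits          # isolate the lowest set bit
        yield lowest.bit_length() - 1  # its position, as a bit scan would report
        bits ^= lowest                 # clear it and continue


def expand(src: str, dst: str, bit_string: str) -> List[Tuple[str, str]]:
    """Expand a temporal edge; bit_string[i] refers to snapshot i."""
    bits = int(bit_string[::-1], 2)    # leftmost character becomes bit 0
    return [(f"{src}@{s}", f"{dst}@{s}") for s in set_bits(bits)]


edges_v2 = expand("v1", "v2", "110")   # present in snapshots 0 and 1
edges_v3 = expand("v1", "v3", "111")   # present in all three snapshots
```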

Page 33: Chronos: A Graph Engine for Temporal Graph Analysis

Temporal Graph Re-construction

• User input time points: 0, 10, 20
• Scan the graph activity log [Type, Endpoints, Time]:
    addE, v0->v1, 0
    addE, v0->v2, 15
    addE, v0->v3, 6
    delE, v0->v3, 8
• Temporal edges [Endpoints, BitSet]:
    v0->v1, 111
    v0->v2, 001
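The example on this slide can be reproduced by replaying the activity log in time order and recording which edges are live at each requested time point. The code is a sketch under stated assumptions: the function name `reconstruct` and the tuple format are made up, and an activity that happens exactly at a time point is treated as already applied in that snapshot, which matches the slide's "111" for an edge added at time 0.

```python
from typing import Dict, List, Set, Tuple

Activity = Tuple[str, str, int]  # (type, endpoints, time)


def reconstruct(log: List[Activity], time_points: List[int]) -> Dict[str, str]:
    """Return, per edge, a bit string with one digit per time point."""
    presence: Dict[str, List[str]] = {}
    live: Set[str] = set()
    pending = sorted(log, key=lambda a: a[2])  # replay in time order
    i = 0
    for k, t in enumerate(sorted(time_points)):
        # Apply every activity that happened at or before this time point.
        while i < len(pending) and pending[i][2] <= t:
            kind, edge, _ = pending[i]
            if kind == "addE":
                live.add(edge)
            elif kind == "delE":
                live.discard(edge)
            i += 1
        # Mark the edges that exist in snapshot k.
        for edge in live:
            presence.setdefault(edge, ["0"] * len(time_points))[k] = "1"
    return {edge: "".join(bits) for edge, bits in presence.items()}


log = [("addE", "v0->v1", 0), ("addE", "v0->v2", 15),
       ("addE", "v0->v3", 6), ("delE", "v0->v3", 8)]
temporal_edges = reconstruct(log, [0, 10, 20])
```

The edge v0->v3 is added at time 6 and deleted at time 8, so it exists at none of the three time points and does not appear in the result, as on the slide.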

Page 34: Chronos: A Graph Engine for Temporal Graph Analysis

Chronos System Overview

[Figure: the on-disk temporal graph contains all the graph-evolving activities. The user inputs multiple time points; the activities (log) are scanned and the graph snapshots are reconstructed into the in-memory temporal graph, which contains only the snapshots of interest (vertex data array v1 v1' v1'' v2 v2' v2'' ...; edge array (v1) → v2 111, (v1) → v3 111, ...). The output is the temporal properties.]

Page 35: Chronos: A Graph Engine for Temporal Graph Analysis

Greater Batch Size of LABS

• Pros
  • Possible to further reduce cache misses / inter-core comm.
• Cons
  • Bit-width limit of the instruction: _BitScanForward64
  • Less snapshot similarity within a batch
  • No more cache misses / inter-core comm. to reduce
  • False sharing with locking
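The bit-width limit can be read as a cap on the batch size: if the presence bits of a batch are held in one 64-bit word, which is what the mention of `_BitScanForward64` suggests, a batch covers at most 64 snapshots and a longer series has to be split. The helper below sketches that split. The name `make_batches` and the constant `WORD_BITS` are assumptions for illustration.

```python
from typing import List, Sequence

WORD_BITS = 64  # assumed: one presence bit per snapshot in a 64-bit word


def make_batches(time_points: Sequence[int], batch_size: int) -> List[List[int]]:
    """Split a series of snapshot time points into consecutive batches."""
    if not 1 <= batch_size <= WORD_BITS:
        raise ValueError(f"batch_size must be between 1 and {WORD_BITS}")
    return [list(time_points[i:i + batch_size])
            for i in range(0, len(time_points), batch_size)]


# 100 snapshots with a batch size of 32 give batches of 32, 32, 32, and 4.
batches = make_batches(list(range(100)), 32)
sizes = [len(b) for b in batches]
```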

Page 36: Chronos: A Graph Engine for Temporal Graph Analysis

Compute Snapshot by Snapshot (another way)

• Snapshot-Parallelism

[Figure: Snapshot 1, Snapshot 2, and Snapshot 3 each have their own vertex data array (v1 v2 ... v3 ...; v1' v2' ... v3' ...; v1'' v2'' ... v3'' ...) and are assigned to Core 0, Core 1, and Core 2. Cache misses and inter-core communication between Core 0 and Core 1 are marked.]

⇒ 3 cache misses
⇒ 3 inter-core comm.

Page 37: Chronos: A Graph Engine for Temporal Graph Analysis

Parallelization: Summary

                              Snapshot by snapshot                            LABS
                              Partition-Parallelism   Snapshot-Parallelism   LABS-Parallelism
  Cache Miss                  More                    More                   Less
  Inter-core Communications   More                    No                     Less

Good partitioning: num. of intra-partition edges > num. of inter-partition edges

• Partition-Parallelism: computing partitions of the same snapshot in parallel
• Snapshot-Parallelism: computing snapshots in parallel
• LABS-Parallelism: computing LABS-batched partitions in parallel

Page 38: Chronos: A Graph Engine for Temporal Graph Analysis

LABS Performance on Multi-Core

LABS-Parallelism outperforms

[Figure: "PageRank on Wiki". Speedup (y-axis, 0 to 90) versus # of cores (x-axis, 0 to 16) for Partition-Parallelism, LABS-Parallelism, and Snapshot-Parallelism. Baseline (speedup of 1): single core.]

Page 39: Chronos: A Graph Engine for Temporal Graph Analysis

LABS Performance on Cluster

• A small cluster with 4 machines
• Benefit is less than in the single-machine test
  • The benefit of LABS is hidden by the high overhead of the network

Time (s):

             Baseline   LABS
  PageRank   7318       2002
  WCC        6405       1250
  SSSP       518        48

Up to 10x speedup

Page 40: Chronos: A Graph Engine for Temporal Graph Analysis

Reduced Lock Contentions

• LABS amortizes the lock cost across snapshots
• PageRank on the Wiki graph

Lock time (seconds):

  Number of Cores   No LABS   LABS   Reduction
  2                 28.85     1.32   95%
  4                 34.25     1.34   96%
  8                 47.54     1.85   96%
  16                96.73     4.02   96%

Reduced the time of locking by more than 95%

Page 41: Chronos: A Graph Engine for Temporal Graph Analysis

LABS with Incremental Computation

• Traditional incremental computing
• Incremental computing with LABS

[Figure: both schemes are drawn over Snapshot 0 to Snapshot 3. The second scheme applies LABS (BatchSize = 3) to the incremental computing.]

Page 42: Chronos: A Graph Engine for Temporal Graph Analysis

Gain of Incremental LABS

[Figure: improvement (y-axis, 0% to 70%) versus batch size (x-axis, log scale, 1 to 100) for WCC and SSSP. Baseline: traditional incremental.]

Page 43: Chronos: A Graph Engine for Temporal Graph Analysis

Related work

• Existing graph engines: static graph engines
  • Pregel (SIGMOD'10)
  • PowerGraph (OSDI'12)
  • GraphLab (VLDB'12)
  • Grace (ATC'12)
  • X-Stream (SOSP'13)
  • …
• Active studies on changes and new concepts in evolving graphs
  • Densification law, "shrinking diameters" (KDD'05)
  • PageRank (CIKM'07), Facebook user activities (EuroSys'09), centrality in evolving graphs (MLG'10), retweet after N friends' retweets (WWW'11), rumor detection (SOMA'10), …