CLUDE: An Efficient Algorithm for LU Decomposition Over a Sequence of Evolving Graphs
Chenghui Ren, Luyi Mo, Ben Kao, Reynold Cheng, David Cheung
The University of Hong Kong
{chren, lymo, kao, ckcheng, dcheung}@cs.hku.hk
Modeling the World as Graphs
Social networks, the Web
Graph-based Queries
Measures of the importance of nodes: PageRank (PR), SALSA
Measures of the proximity between nodes: Personalized PageRank (PPR), Random Walk with Restart (RWR), Discounted Hitting Time (DHT)
Introduction
A common property: computing them (PR, SALSA, PPR, DHT, RWR) requires solving linear systems Ax = b
# of nodes in the graph: n
A: n x n matrix, captures the graph structure
b: vector of size n, the input query vector, depends on the measure computed
x: vector of size n, gives the measures of the nodes in the graph
Example: Random Walk with Restart (RWR)
[Figure: example graph with starting node u and nodes 2, 3, 4]
With a probability d, transit to a neighboring node
With a probability (1-d), transit to the starting node
x(v): steady-state probability that we are at node v
A: derived from the graph
b: RWR with starting node 1
x: RWR scores
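As a minimal sketch of the RWR description above, the steady-state equation x = d·W·x + (1-d)·e_u can be rearranged into the linear system (I - dW)x = (1-d)e_u. The column-normalized transition matrix W, the function names, the toy 4-node graph, and d = 0.85 are assumptions made for illustration; the slide only says that A is derived from the graph, so its exact A and b may be scaled differently.

```python
def gauss_solve(A, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for k in range(n):
        # Pick the row with the largest pivot candidate in column k.
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def rwr_scores(adj, start, d=0.85):
    """RWR scores for a graph given as adjacency lists (assumed: every node has a neighbor)."""
    n = len(adj)
    # W[v][u] = probability of moving from u to v (each column sums to 1).
    W = [[0.0] * n for _ in range(n)]
    for u, nbrs in enumerate(adj):
        for v in nbrs:
            W[v][u] = 1.0 / len(nbrs)
    # A = I - dW, b = (1 - d) * e_start
    A = [[(1.0 if i == j else 0.0) - d * W[i][j] for j in range(n)] for i in range(n)]
    b = [(1.0 - d) if i == start else 0.0 for i in range(n)]
    return gauss_solve(A, b)


# Hypothetical undirected 4-node graph; node 0 plays the role of the starting node.
adj = [[1, 2], [0, 2], [0, 1, 3], [2]]
x = rwr_scores(adj, start=0)
```

Because each column of W sums to 1, the scores in x sum to 1, which matches their reading as steady-state probabilities.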
Graphs Evolve over Time
Evolving Graph Sequence (EGS) [VLDB’11]
[Figure: graph snapshots along a time axis, with a measure computed on each snapshot]
Information modeled by a graph changes over time.
Example: PR Score Trend Analysis
Wikipedia: 20,000 Wiki pages, 1000 daily snapshots
Key moments: the PR score changes significantly
Evolving Matrix Sequence (EMS)
Objective: efficiently compute various measures over an EMS
Challenges
Many b's: RWR scores between any two nodes require n b's
  Approach: LU decomposition
Many A's: one for each matrix in the EMS; 1 year of daily snapshots gives 365 A's
LU Decomposition (LUDE)
LU decomposition: A = LU (LU factors)
Solving LUx1 = b1, LUx2 = b2, …, LUxq = bq by forward & backward substitutions
Each solve is much faster than LU decomposition
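The reuse of LU factors across many b's can be sketched as follows. This is an illustrative dense Doolittle factorization without pivoting (it assumes non-zero pivots, which holds for the diagonally dominant toy matrix used here); the function names and the example matrix are assumptions, not taken from the slides.

```python
def lu_decompose(A):
    """Doolittle LU without pivoting: returns (L, U) with A = L * U."""
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [[float(v) for v in row] for row in A]
    for k in range(n):
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]
    return L, U


def lu_solve(L, U, b):
    """Solve LUx = b by forward substitution (Ly = b), then backward substitution (Ux = y)."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x


# Factor once, then reuse the factors for several right-hand sides.
A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
L, U = lu_decompose(A)
bs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
xs = [lu_solve(L, U, b) for b in bs]
```

The factorization is paid for once; each additional b costs only the two substitution passes.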
Fill-ins in LUDE
#fill-ins: 8 (fill-in: an entry that is 0 in A but becomes non-zero in L and U)
More fill-ins will cause:
  More space to store (L, U)
  More time to do forward/backward substitutions in solving LUx = b
  More time to do LU decomposition
Preserving Sparsity in LUDE: Matrix Reordering
Before reordering: #fill-ins: 8
After reordering: #fill-ins: 1
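The effect of reordering on fill-ins can be illustrated with a symbolic elimination that tracks only which entries are non-zero. The sketch below uses a hypothetical 5 x 5 "arrow" matrix (dense first row and column plus the diagonal), not the matrix shown on the slide, and it assumes diagonal pivots and no numerical cancellation; `count_fill_ins` is a name introduced here for illustration.

```python
def count_fill_ins(pattern, order):
    """Count fill-ins when pivots are eliminated in the given order.

    pattern: set of (row, col) positions that are non-zero in A.
    order:   permutation of node indices, applied to rows and columns alike.
    """
    filled = set(pattern)
    eliminated = set()
    fill = 0
    for k in order:
        eliminated.add(k)
        # Rows and columns still active that touch pivot k.
        rows = [i for i in order if i not in eliminated and (i, k) in filled]
        cols = [j for j in order if j not in eliminated and (k, j) in filled]
        for i in rows:
            for j in cols:
                if (i, j) not in filled:
                    filled.add((i, j))  # a zero entry becomes non-zero
                    fill += 1
    return fill


n = 5
arrow = {(i, i) for i in range(n)}
arrow |= {(0, j) for j in range(n)} | {(j, 0) for j in range(n)}

hub_first = count_fill_ins(arrow, [0, 1, 2, 3, 4])  # hub eliminated first
hub_last = count_fill_ins(arrow, [4, 3, 2, 1, 0])   # hub eliminated last
```

Eliminating the dense node first fills in every off-diagonal entry among the remaining nodes, while eliminating it last creates no fill-in at all.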
Preserving Sparsity in LUDE: Matrix Ordering
Finding the optimal ordering to minimize #fill-ins is NP-complete
Effective heuristic reordering strategies: Markowitz (most effective), AMD, Degree, …
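A Markowitz-style heuristic greedily picks, at each step, the pivot with the smallest product (r - 1)(c - 1), where r and c are the numbers of non-zeros in the pivot's row and column of the remaining submatrix. The sketch below is a simplified variant that only considers diagonal pivots and assumes every diagonal entry is non-zero; the function name and the arrow-matrix example are assumptions for illustration and are not the authors' implementation.

```python
def markowitz_order(pattern, n):
    """Greedy diagonal-pivot Markowitz ordering on a sparsity pattern.

    Returns (order, fill): the pivot order and the number of fill-ins it creates.
    """
    filled = set(pattern)
    remaining = set(range(n))
    order = []
    fill = 0
    while remaining:
        best, best_cost = None, None
        for k in sorted(remaining):  # sorted for deterministic tie-breaking
            r = sum(1 for j in remaining if (k, j) in filled)
            c = sum(1 for i in remaining if (i, k) in filled)
            cost = (r - 1) * (c - 1)  # upper bound on fill-ins caused by pivot k
            if best_cost is None or cost < best_cost:
                best, best_cost = k, cost
        remaining.remove(best)
        rows = [i for i in remaining if (i, best) in filled]
        cols = [j for j in remaining if (best, j) in filled]
        for i in rows:
            for j in cols:
                if (i, j) not in filled:
                    filled.add((i, j))
                    fill += 1
        order.append(best)
    return order, fill


n = 5
arrow = {(i, i) for i in range(n)}
arrow |= {(0, j) for j in range(n)} | {(j, 0) for j in range(n)}
order, fill = markowitz_order(arrow, n)
```

On this example the heuristic postpones the dense node 0 and produces an ordering with no fill-ins.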
Challenges
Many b's (RWR scores between any two nodes: n b's): Reordering + LU decomposition
Many A's (each matrix in the EMS; 1 year of daily snapshots: 365 A's): Reordering for all A's + LU decomposition for all A's
LUDE over an EMS (LUDEM) Problem
How many orderings should be computed?
  T orderings?
  1 ordering?
  Others?
The EMS gradually evolves over time: successive graphs in Wiki share 99% of edges
Can we apply incremental methods?
Brute Force (BF): T orderings
Best ordering quality, but slow
Markowitz orderings
Straightforward Incremental (INC): 1 ordering
Bennett's incremental LUDE [1965]
Bad ordering!
Cluster-based Incremental (CINC)
[Figure: the EMS partitioned into Cluster 1, …, Cluster M]
Tradeoff between good ordering and fast incremental LUDE
Overhead of Structural Change
Zooming in on Bennett's incremental LU:
1. Structure allocation to store LU factors (adjacency-list structures): 70%
2. Numerical computation
Solution: Universal Static Structure
Universal Static Structure: able to accommodate the non-zero entries of the LU factors of all matrices in a cluster
CLUDE: Fast Cluster-based LU Decomposition
[Figure: Cluster 1, …, Cluster M, each with a static structure]
No structural change overhead, better ordering quality
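One way to picture a universal static structure is as the symbolic LU pattern of the union of all matrices in a cluster. The sketch below assumes that the union matrix plays the role of AU, that all matrices share one elimination order, and that pivots lie on the diagonal; the function names and the three toy snapshots are hypothetical. Under these assumptions the symbolic fill is monotone in the input pattern, so the union's structure has a slot for every non-zero of every snapshot's LU factors.

```python
def symbolic_lu(pattern, n):
    """Sparsity pattern of the LU factors, eliminating pivots 0..n-1 in order."""
    filled = set(pattern)
    for k in range(n):
        rows = [i for i in range(k + 1, n) if (i, k) in filled]
        cols = [j for j in range(k + 1, n) if (k, j) in filled]
        for i in rows:
            for j in cols:
                filled.add((i, j))
    return filled


def universal_structure(patterns, n):
    """Symbolic LU pattern of the union of all patterns in a cluster."""
    union = set().union(*patterns)
    return symbolic_lu(union, n)


n = 4
diag = {(i, i) for i in range(n)}
s1 = diag | {(0, 1), (1, 0)}
s2 = diag | {(0, 1), (1, 0), (0, 2), (2, 0)}
s3 = diag | {(0, 2), (2, 0), (2, 3), (3, 2)}
universal = universal_structure([s1, s2, s3], n)
covered = all(symbolic_lu(s, n) <= universal for s in (s1, s2, s3))
```

Since the structure is allocated once per cluster, moving from one snapshot to the next only changes numerical values, not the layout.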
Experimental Setup
Datasets
  Two real datasets (which derive two EMSs)
  ▪ Wiki (pages and their hyperlinks), default
  ▪ DBLP (authors and their co-authorships)
  Synthetic EMSs
Settings
  Java, Linux, CPU: 3.4 GHz octo-core, Memory: 16 GB

Dataset  #snapshots  |V|     |E1|     |Elast|
Wiki     1000        20,000  56,181   138,072
DBLP     1000        97,931  387,960  547,164
Evaluation of a Solution
Ordering quality
  Quality-loss of an ordering O of A, based on the # of extra fill-ins compared with O*
  O*: Markowitz ordering of A
Efficiency
  Speedup over BF's execution time
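The quality-loss formula itself did not survive in this transcript. One plausible form, stated here purely as an assumption consistent with the "# of extra fill-ins" annotation, normalizes the extra fill-ins by the size of the LU factors obtained under the Markowitz ordering. The symbol s(·), the number of non-zero entries in the LU factors of a reordered matrix, is introduced here for illustration.

```latex
\[
  \mathrm{ql}(O, A) \;=\;
  \frac{s\!\left(A^{O}\right) - s\!\left(A^{O^{*}}\right)}{s\!\left(A^{O^{*}}\right)}
\]
% s(A^{O}):   number of non-zero entries in the LU factors of A reordered by O (assumed notation)
% O^{*}:      the Markowitz ordering of A
% numerator:  the number of extra fill-ins of O relative to O^{*}
```

Under this reading, a quality-loss of 0 means that O creates no more fill-ins than the Markowitz ordering.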
Ordering Quality: INC
INC applies the Markowitz ordering of A1 to all matrices in the whole EMS
[Figure: plots against snapshot number]
Ordering Quality: CINC, CLUDE
CINC applies the Markowitz ordering of A1 to all matrices in the cluster
CLUDE applies the Markowitz ordering of AU to all matrices in the cluster
Efficiency
Reasons for the big gap between CLUDE and CINC:
(1) CLUDE gives better ordering quality
(2) CLUDE uses static data structures for storing the matrices' LU factors
Synthetic Dataset
General observation: CLUDE gives the best ordering quality and, at the same time, is much faster than INC and CINC
Related Work
EGS processing
  Computation of shortest path distance between two nodes across a graph sequence
Computation of various measures (PR/SALSA/PPR/DHT/RWR) on single graphs
  Approximation methods (power iteration, Monte Carlo)
  ▪ Two orders of magnitude faster if A is decomposed
  Sparse matrix decomposition
Maintaining measures incrementally
  Approximation methods
  ▪ An order of magnitude faster
Graph streams
  How to detect sub-graphs that change rapidly over a small window of the stream
  Graphs that arrive in the stream are not archived
Conclusions
We studied the LUDEM problem
  Interesting structural analyses on a sequence of evolving graphs can be carried out efficiently
We designed CLUDE for the LUDEM problem, based on matrix ordering and incremental LU decomposition
CLUDE outperformed the others in terms of both ordering quality and speed
Q & A
Thank you!
Contact Info: Luyi Mo, University of Hong Kong, lymo@cs.hku.hk
http://www.cs.hku.hk/~lymo
Our Solutions
Many b's: LU decomposition
Many A's: LU decomposition for all A's
  BF: T orderings (1 ordering for 1 matrix); best ordering, slow
  INC: 1 ordering (for all matrices); bad ordering, slow
  CINC: cluster-based; good ordering, fast
  CLUDE: cluster-based, static structure; good ordering, fastest
Example 2: Analysis of Actions to Improve PR Score
Translating the web page
Publicizing the web site through newsletters
Providing a rich site summary
…
How to evaluate the effectiveness of these actions?
[Figure labels: official guide; actions taken; changes to PR score]
Clustering Algorithm
Segmentation clustering algorithm:
  A cluster consists of successive snapshots
  A cluster satisfies:
[Figure: the EMS segmented into clusters]
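The condition that a cluster must satisfy is not recoverable from this transcript. As a hedged sketch of how a segmentation over successive snapshots could work, the code below assumes a hypothetical similarity condition: the entries shared by all snapshots in a cluster must make up at least a fraction alpha of the entries appearing in any of them. The condition, the parameter alpha, and the function name are assumptions for illustration only.

```python
def segment(snapshots, alpha):
    """Greedily split a sequence of edge sets into clusters of successive snapshots.

    A cluster is extended while |intersection| >= alpha * |union| still holds
    (an assumed condition). Returns (first_index, last_index) pairs.
    """
    clusters = []
    start = 0
    inter = set(snapshots[0])
    union = set(snapshots[0])
    for t in range(1, len(snapshots)):
        new_inter = inter & snapshots[t]
        new_union = union | snapshots[t]
        if len(new_inter) >= alpha * len(new_union):
            inter, union = new_inter, new_union  # snapshot t joins the cluster
        else:
            clusters.append((start, t - 1))      # close the cluster, open a new one
            start = t
            inter = set(snapshots[t])
            union = set(snapshots[t])
    clusters.append((start, len(snapshots) - 1))
    return clusters


# Toy edge sets (edges abbreviated as integers): slow drift, then an abrupt change.
snapshots = [
    {1, 2, 3, 4},
    {1, 2, 3, 4, 5},
    {1, 2, 3, 4, 5, 6},
    {7, 8, 9},
    {7, 8, 9, 10},
]
clusters = segment(snapshots, alpha=0.6)
```

A larger alpha yields smaller, more homogeneous clusters; a smaller alpha yields fewer clusters whose members differ more.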
Future Work
Distributed algorithms
Key moment detection
  Key moment of a measure over an EGS: the moment at which the measure score changes dramatically
LUDEM-QC Problem (For Symmetric EMS)
It can be easily computed for symmetric matrices
Solutions for LUDEM-QC
Key: control the size of the cluster
  The smaller the cluster is, the higher the chance that CINC or CLUDE satisfies the quality constraint
Beta-clustering algorithms are thus proposed
Synthetic Dataset
Case Study
In 1992, IBM and HARRIS announced their alliance to share technology
HARRIS’s stock price hit a closing high shortly after the announcement