SnapNETS: Automatic Segmentation of Network Sequences with Node Labels
![Page 1: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/1.jpg)
SnapNETS: Automatic Segmentation of Network Sequences with Node Labels
Sorour E. Amiri, Liangzhe Chen, B. Aditya Prakash
Department of Computer Science, Virginia Tech
AAAI, San Francisco, USA, February 9, 2017
![Page 2: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/2.jpg)
Outline
• Motivation
• Alternative Approaches
• Our Proposed Method: SnapNETS
• Experiments
• Conclusion
Amiri, Chen, Prakash 2
![Page 3: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/3.jpg)
Network Sequences
• Epidemiology: disease spreads over contact networks
• Social media: information spreads over friendship networks
[Figure: two sequences of four snapshots, G1 to G4. Flu: nodes are Uninfected or Infected. Meme: nodes are Inactive or Active.]
![Page 4: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/4.jpg)
Making sense of network sequences
Flu: when do the infection patterns change?
Reason:
• Virus mutation
• Vaccination
• …
[Figure: snapshots G1 to G4 with Uninfected and Infected nodes; patterns: Star, Bridge, Near Clique.]
![Page 5: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/5.jpg)
Making sense of network sequences
Meme: when do the activation patterns change?
Reason:
• Event
• …
[Figure: snapshots G1 to G4 with Inactive and Active nodes; patterns: Star, Clique.]
![Page 6: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/6.jpg)
Problem 1: Network sequence segmentation
Given: a sequence of networks with labeled nodes.
Find: the best segmentation, i.e. one whose segments capture different distributions of node labels.
In this work: binary labels {0, 1}
[Figure: snapshots G1 to G4; patterns: Star, Bridge, Near Clique.]
![Page 7: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/7.jpg)
Desirable Properties
P1. Parameter-free:
• No threshold, no fixed granularity
P2. Comprehensive:
• Use the entire graph
P3. Scalable
![Page 8: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/8.jpg)
![Page 9: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/9.jpg)
Alternative 1: Feature Extraction & Time-series
Step 1: Feature extraction
• F1: #cliques (of active subgraph)
• F2: #ladders (of inactive subgraph)
• F3: #ladders (of active subgraph)
• …
Step 2: Time-series segmentation
[Henderson et al. 2010] [Likas, Vlassis, and Verbeek 2003] [Li et al. 2009]
[Figure: feature time series F1, F2, F3 over snapshots G1 to G4; example value rows: 0 0 0 … 2, 1 1 0 … 0, 0 0 0 … 1.]
![Page 10: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/10.jpg)
Alternative 1: Feature Extraction & Time-series
Drawbacks:
• Laborious feature engineering
  o # Cliques
  o # Ladders
• “Local” change detection:
  o One aggregation time period
  o Threshold
[Figure: feature time series F1, F2, F3 over snapshots G1 to G4.]
![Page 11: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/11.jpg)
Alternative 2: Plain-graph-based analysis
Step 1: Extract active subgraphs
Step 2: Dynamic graph segmentation
[Shah et al. 2015] [Sun et al. 2007] [Lin et al. 2009] [Qu et al. 2014]
[Figure: snapshots G1 to G4 and their active subgraphs.]
![Page 12: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/12.jpg)
Alternative 2: Plain-graph-based analysis
Drawbacks:
• Inactive nodes are important for detecting different patterns
[Figure: dynamic graph segmentation on the entire graph, snapshots G1 to G4; in the chain example the roles are different.]
![Page 13: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/13.jpg)
Desirable Properties
P1. Parameter-free:
• No threshold, no fixed granularity
P2. Comprehensive:
• Use the entire graph
P3. Scalable
Comparison of SnapNETS
![Page 14: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/14.jpg)
Outline
• Motivation
• Alternative Approaches
• Our Proposed Method: SnapNETS
  o Main Idea and Overview
  o Goal 1: Summarizing Act-snapshots
  o Goal 2: Constructing the segmentation graph
  o Goal 3: Finding the best segmentation
• Experiments
• Conclusion
![Page 15: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/15.jpg)
Main Idea: Segmentation graph
Nodes: one node for each segment, plus {Source (‘s’), Target (‘t’)}
• Source (‘s’) = start time
• Target (‘t’) = end time
Edges: there is a directed edge between nodes whose segments are adjacent in time
Best segmentation problem → Path optimization problem
[Figure: input sequence and its segmentation graph.]
![Page 16: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/16.jpg)
Overview of SnapNETS
Goal 1. Summarize each graph:
• Keep structural and label-dependent properties
Goal 2. Construct the segmentation graph:
• Define nodes and edges
• Define edge weights
  o Extract the features of summarized graphs
Goal 3. Find the best segmentation:
• Define the best segmentation (path)
• Compute the best segmentation
![Page 17: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/17.jpg)
Technical Challenges
• Using the entire graph snapshots: summarize each graph while satisfying P2
• Finding the number of segments: compute the segmentation while satisfying P1
Reminder: P1. Parameter-free, P2. Comprehensive, P3. Scalable
![Page 18: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/18.jpg)
![Page 19: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/19.jpg)
Goal 1: Summarizing graph snapshots
We want to preserve:
• Structural properties
• Node labels
Role of the eigenvalue:
• Leading eigenvalue of the adjacency matrix
• Epidemic threshold in most diffusion models [Prakash et al. ICDM 2011]
• Same leading eigenvalue → same diffusive properties
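To make the role of the leading eigenvalue concrete, here is a minimal sketch that estimates it with power iteration in plain Python. The function name `leading_eigenvalue` and the two toy graphs are illustrative assumptions, not part of the SnapNETS code. The iteration runs on A + I rather than A so that it also converges on bipartite graphs such as a star.

```python
def leading_eigenvalue(adj, iterations=200):
    """Estimate the largest eigenvalue of a symmetric 0/1 adjacency matrix.

    Hypothetical helper for illustration. Power iteration is applied to
    A + I (same eigenvectors, eigenvalues shifted by 1) and the result is
    read off with the Rayleigh quotient of A.
    """
    n = len(adj)
    x = [1.0] * n
    for _ in range(iterations):
        # y = (A + I) x
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    # Rayleigh quotient x^T A x for the unit vector x
    ax = [sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(x[i] * ax[i] for i in range(n))


# Clique on 4 nodes: leading eigenvalue is n - 1 = 3.
clique4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]

# Star with 4 leaves (node 0 is the hub): leading eigenvalue is sqrt(4) = 2.
star4 = [[0] * 5 for _ in range(5)]
for leaf in range(1, 5):
    star4[0][leaf] = star4[leaf][0] = 1

lam_clique = leading_eigenvalue(clique4)
lam_star = leading_eigenvalue(star4)
```

The clique has the larger leading eigenvalue, which matches the intuition that a contagion spreads more easily on a clique than on a star.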
![Page 20: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/20.jpg)
Our summarization approach
We want to get a smaller graph with similar eigenvalues:
• Successively merge nodes
![Page 21: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/21.jpg)
Problem 2: Graph summarization
Given: a graph with labeled nodes and a compression ratio.
Find: a coarsened graph such that:
![Page 22: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/22.jpg)
Our Approach
Given: a graph with labeled nodes and a compression ratio.
Find: a coarsened graph such that:
• Keep leading eigenvalue
  o Matrix perturbation approach
• Based on CoarseNet [Purohit et al. KDD 2014]
  o Successively merge nodes
  o Do not merge nodes with different labels
[Figure: example graph with edge weights 0.1 and 0.2.]
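The label constraint on merging can be sketched as follows. This is only an illustration of the rule "do not merge nodes with different labels" on an unweighted, undirected edge list. It does not reproduce CoarseNet's eigenvalue-based scoring or its edge reweighting, and the helper names `mergeable_edges` and `merge_pair` are assumptions made for this example.

```python
def mergeable_edges(edges, labels):
    """Edges whose two endpoints carry the same label (merge candidates)."""
    return [(u, v) for u, v in edges if labels[u] == labels[v]]


def merge_pair(edges, labels, keep, drop):
    """Contract node `drop` into node `keep` and return the new edge list.

    Simplified: self-loops and duplicate edges are discarded, and no edge
    weights are tracked (the real method reweights edges).
    """
    if labels[keep] != labels[drop]:
        raise ValueError("nodes with different labels must not be merged")
    merged = set()
    for u, v in edges:
        u = keep if u == drop else u
        v = keep if v == drop else v
        if u != v:
            merged.add((min(u, v), max(u, v)))
    return sorted(merged)


# Path 0 - 1 - 2 - 3 where nodes 0 and 1 are active (1) and 2 and 3 inactive (0).
edges = [(0, 1), (1, 2), (2, 3)]
labels = {0: 1, 1: 1, 2: 0, 3: 0}

candidates = mergeable_edges(edges, labels)   # the edge (1, 2) is excluded
coarse = merge_pair(edges, labels, keep=0, drop=1)
```

In a full implementation the candidate with the smallest change in leading eigenvalue would be merged first, and merging would stop once the compression ratio is reached.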
![Page 23: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/23.jpg)
![Page 24: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/24.jpg)
Goal 2: Segmentation graph
Nodes: one node for each segment, plus {Source (‘s’), Target (‘t’)}
• Source (‘s’) = start time
• Target (‘t’) = end time
Edges: there is a directed edge between nodes whose segments are adjacent in time
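The node and edge definitions can be sketched in a few lines of Python. The representation is an assumption made for illustration: a candidate segment is a half-open pair `(i, j)` covering snapshots `i` to `j - 1` (0-indexed), and the function names are hypothetical.

```python
def build_segmentation_graph(num_snapshots):
    """Return (segments, edges) of the segmentation graph.

    Nodes are 's', 't', and one node per candidate segment (i, j).
    A directed edge links two segments when the second starts exactly
    where the first ends; 's' links to segments starting at time 0 and
    segments ending at the last snapshot link to 't'.
    """
    segments = [(i, j)
                for i in range(num_snapshots)
                for j in range(i + 1, num_snapshots + 1)]
    edges = []
    for i, j in segments:
        if i == 0:
            edges.append(("s", (i, j)))
        if j == num_snapshots:
            edges.append(((i, j), "t"))
        for k in range(j + 1, num_snapshots + 1):
            edges.append(((i, j), (j, k)))
    return segments, edges


def count_paths(edges, source="s", target="t"):
    """Count source-to-target paths in the DAG with memoized recursion."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
    memo = {}

    def walk(node):
        if node == target:
            return 1
        if node not in memo:
            memo[node] = sum(walk(nxt) for nxt in successors.get(node, []))
        return memo[node]

    return walk(source)


segments, edges = build_segmentation_graph(4)
num_paths = count_paths(edges)
```

For four snapshots there are 10 candidate segments, and the 8 paths from ‘s’ to ‘t’ correspond one-to-one to the 2^(4-1) = 8 ways of cutting the sequence into contiguous segments.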
![Page 25: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/25.jpg)
Edge Weights
How can we measure the distance between two segments?
[Figure: an edge of the segmentation graph with unknown weight w.]
![Page 26: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/26.jpg)
Our Approach
Step 1: Extract features from summary graphs:
• Easier and more efficient than on the original graphs
• No complex features
Example feature vector: F = [3.9, 13, ..., 2.2]
![Page 27: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/27.jpg)
Edge Weights
Step 2: Distance of adjacent segments
[Figure: edge weight w between two adjacent segments.]
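The slides do not spell out the distance function, so the following is only one plausible reading, stated as an assumption: each segment is represented by the mean of its snapshots' feature vectors, and the edge weight is the Euclidean distance between the two means. The names `segment_feature` and `edge_weight` are hypothetical.

```python
import math


def segment_feature(features, start, end):
    """Mean feature vector of snapshots start..end-1 (assumed aggregation)."""
    rows = features[start:end]
    return [sum(column) / len(rows) for column in zip(*rows)]


def edge_weight(features, i, j, k):
    """Distance between adjacent segments (i, j) and (j, k).

    Euclidean distance is an assumption; any distance between feature
    vectors could be substituted here.
    """
    return math.dist(segment_feature(features, i, j),
                     segment_feature(features, j, k))


# Toy per-snapshot feature vectors: the pattern changes after snapshot 2.
features = [[0.0, 0.0], [0.0, 0.0], [3.0, 4.0], [3.0, 4.0]]

w_change = edge_weight(features, 0, 2, 4)   # segments differ strongly
w_same = edge_weight(features, 0, 1, 2)     # segments look identical
```

A large weight marks a boundary between two segments that look different, which is what the path optimization in Goal 3 rewards.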
![Page 28: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/28.jpg)
![Page 29: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/29.jpg)
Goal 3: Finding the best segmentation
Observation:
• For each segmentation there is a path from ‘s’ to ‘t’
• For each path from ‘s’ to ‘t’ there is a segmentation
Therefore:
• Best segmentation problem → Path optimization problem
![Page 30: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/30.jpg)
Possible approach: Longest path?
Given: a segmentation graph
Find: the longest path from ‘s’ to ‘t’
Problem: over-segmentation
[Figure: two paths from ‘s’ to ‘t’. One has many edges of weight 0.01 each (Sum = 3); the other has three edges of weight 0.9 each (Sum = 2.7).]
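The over-segmentation effect in this example can be checked with a few lines of arithmetic. The count of 300 small edges is inferred from the figure's numbers (a sum of 3 at 0.01 per edge), not stated on the slide.

```python
# Path A: many tiny boundaries (assumed 300 edges of weight 0.01 each).
many_small = [0.01] * 300
# Path B: three strong boundaries of weight 0.9 each.
few_large = [0.9] * 3


def total(path):
    """Objective of the plain longest-path formulation."""
    return sum(path)


def average(path):
    """Objective of the average-longest-path formulation."""
    return sum(path) / len(path)


longest_path_prefers_a = total(many_small) > total(few_large)
average_prefers_b = average(few_large) > average(many_small)
```

Summing weights favours the path with hundreds of weak cuts, while averaging favours the path with a few strong cuts (0.9 per edge against 0.01 per edge).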
![Page 31: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/31.jpg)
Problem 3: Finding the best segmentation
Given: a segmentation graph
Find: the average longest path (ALP) from ‘s’ to ‘t’
Our idea: Average longest path
Advantages:
• Parameter-free
• Naturally balances the weight of the path against the number of segments
![Page 32: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/32.jpg)
Solving ALP
• Finding the ALP in general graphs is NP-hard.
• The segmentation graph is a DAG → ALP can be solved in polynomial time
• State-of-the-art algorithm [Waggoner et al. WACV 2013]
  o Time complexity: cubic (not scalable!)
![Page 33: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/33.jpg)
Our Solution: LAYERED-ALP
• Dynamic programming
• Optimal solution
lp1 = Longest path with 1 segment
lp2 = Longest path with 2 segments
lp4 = Longest path with 4 segments
![Page 34: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/34.jpg)
Our Solution: LAYERED-ALP
• Build layers
• Find the LP in each layer
• Find the ALP
Time complexity: linear!
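The three steps can be sketched as a small dynamic program over a weighted DAG. This is an illustration in the spirit of the layered idea, not the authors' implementation: layer k stores, for every node, the heaviest path from the source that uses exactly k edges, and the answer is the layer whose path to the target has the best weight per edge. How edges leaving ‘s’ and entering ‘t’ are weighted is left to the caller, and the function name is hypothetical.

```python
def average_longest_path(edges, source="s", target="t"):
    """Return (best_average, path) for a DAG given as {(u, v): weight}."""
    nodes = {u for u, _ in edges} | {v for _, v in edges}
    # layers[k][v] = (best total weight of a source->v path with k edges, predecessor)
    layers = [{source: (0.0, None)}]
    for _ in range(len(nodes) - 1):      # a path in a DAG has at most |V| - 1 edges
        nxt = {}
        for (u, v), w in edges.items():
            if u in layers[-1]:
                candidate = layers[-1][u][0] + w
                if v not in nxt or candidate > nxt[v][0]:
                    nxt[v] = (candidate, u)
        if not nxt:
            break
        layers.append(nxt)

    # Pick the layer with the highest weight per edge at the target.
    best = None
    for k in range(1, len(layers)):
        if target in layers[k]:
            avg = layers[k][target][0] / k
            if best is None or avg > best[0]:
                best = (avg, k)
    if best is None:
        return None, []

    # Walk the predecessors back from the target to recover the path.
    avg, k = best
    path, node = [target], target
    while k > 0:
        node = layers[k][node][1]
        path.append(node)
        k -= 1
    return avg, path[::-1]


# Toy DAG: a short path with strong edges and a long path with weak edges.
toy = {
    ("s", "x"): 0.9, ("x", "t"): 0.9,
    ("s", "a"): 0.5, ("a", "b"): 0.5, ("b", "c"): 0.5, ("c", "t"): 0.5,
}
best_avg, best_path = average_longest_path(toy)
```

On the toy graph the plain longest path is the four-edge route (total 2.0), but the average longest path is the two-edge route through `x` with an average of 0.9.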
![Page 35: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/35.jpg)
Complete algorithm
Time complexity: sub-quadratic
![Page 36: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/36.jpg)
Complete algorithm: Parallel
Time complexity:
![Page 37: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/37.jpg)
![Page 38: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/38.jpg)
Experiments: datasets
Different domains with a range of sizes:
• BA-degree: Random Barabasi Albert graph
• AS-Oregon: Autonomous Systems peering information
• Higgs: Tweets dataset (with the follower-followee network)
• Portland: Contact network between people of Portland
• Memetracker: Who-copies-from-whom blog and website network
• IranElect: Follower-followee network of Twitter related to the Iran election
• DBLP: Co-authorship network related to the ‘network’ topic
![Page 39: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/39.jpg)
Experiments: baselines
Feature extraction & time series:
• DYNAMMO [Li et al. KDD 2009]:
  o Change point detection (reconstruction errors)
  o # segments = # segments of SnapNETS
• K-means [Likas et al. Pattern Recognition 2003]:
  o Segment when a new cluster is detected
Dynamic graph:
• VOG [Koutra et al. SDM 2014]:
  o 10 most important sub-structures
  o Cut when the set of sub-structures changes significantly (threshold = the one that gives the best result)
![Page 40: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/40.jpg)
Experiments: baselines-variations
• SN-ORIG: Original graphs instead of summary graphs
• SN-LP: Longest Path instead of ALP
• SN-GREEDY: Greedy approach instead of ALP
![Page 41: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/41.jpg)
Experiments: Quantitative analysis
• SnapNETS outperforms the baselines
• Clear patterns in summary graphs
[Figure: AS-Oregon; infection moves to a new community.]
![Page 42: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/42.jpg)
Case studies: Memetracker
“Can I call you Joe?”
Televised vice-presidential debates
• Summary graphs are close to the case when all nodes have the same label (f5)
• Random nodes are active (f8)
• Summary graphs are substantially sparser (f2)
• Many active nodes got merged into important nodes such as CNN and BBC to form hubs (f6)
![Page 43: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/43.jpg)
Case studies: AS-Oregon
New community → new segment
![Page 44: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/44.jpg)
Scalability
[Figure panels: Scalability of SnapNETS; Speedup by parallelizing construction of the segmentation graph. Annotation: near-linear.]
![Page 45: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/45.jpg)
![Page 46: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/46.jpg)
Discussion: SnapNETS
Patterns: the ‘placement’ and ‘connection’ of active/inactive nodes:
• structural (e.g. community/role/centrality)
• rate changes
Global method: SnapNETS is a ‘global’ method and not simply a change-point detection method.
Graph summarization and features
Average Longest Path
Properties: P1. Parameter-free, P2. Comprehensive, P3. Scalable
![Page 47: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/47.jpg)
Future Work
• Handle dynamic graphs with varying nodes and edges
• More node labels and real-valued features
• Work with partially observed graphs
![Page 48: SnapNETS: Automatic Segmentation of Network Sequences with Node Labels](https://reader036.fdocuments.us/reader036/viewer/2022062401/58cf330b1a28ab00168b5d4b/html5/thumbnails/48.jpg)
Any questions?
Funding:
Code at: https://github.com/SorourAmiri/SnapNETS
Sorour E. Amiri, Liangzhe Chen, B. Aditya Prakash
SnapNETS
• Goal 1, Graph summarization: successively merge nodes, keep leading eigenvalue, keep same set of labels
• Goal 2, Segmentation graph: nodes, edges, edge weights
• Goal 3, Finding the best segmentation: ALP
• Result