Graph Wavelets and Multiscale Community Mining in networks
Introduction Communities in networks Wavelets on graphs Multiscale community mining Conclusion +
Graph Wavelets and Multiscale Community Mining in networks
Pierre Borgnat, Nicolas Tremblay
CR1 CNRS – Laboratoire de Physique, ENS de Lyon, Université de Lyon
Équipe SISYPHE : Signaux, Systèmes et Physique
08/2014
Content of the talk
• General objective: revisit the classical question of finding communities in networks using multiscale processing methods on graphs.
• The things that will be discussed:
1. Recall the notion of community in networks
2. Recall spectral graph wavelets
3. Multiscale community mining with graph wavelets
Examples of networks from our digital world
LinkedIn Network Citation Graph Vehicle Network
USA Power grid Web Graph Protein Network
Communities in networks
• Observed, real-world networks are often inhomogeneous, made of communities (or modules): groups of nodes having a larger proportion of links inside the group than with the outside
• This is observed in various types of networks: social, technological, biological, ...
• There exist several extensive surveys:
[S. Fortunato, Physics Reports, 2010]
[von Luxburg, Statistics and Computing, 2007]
...
Purpose of community detection?
1) It gives us a sketch of the network:
2) It gives us intuition about its components:
Some examples of networks with communities or modules
• Social face-to-face interaction networks [Sociopatterns; Barrat, Cattuto, et al.]
(Physics lab, ENSL, 2013) (primary school; Sociopatterns, 2011)
• Brain networks [Bullmore, Achard, 2006]
[Figure: from the brain ($10^{11}$ neurons) to fMRI ($10^5$ voxels, 0.3 Hz): parcellation, time series, connectivity using wavelets, graphs of cerebral connections]
Classical methods to find communities in networks
• I will not pretend to make a full survey... Some important steps are:
  • Cut algorithms (legacy from computer science)
  • Spectral clustering (relaxed cut problem)
  • Modularity optimization (physicists' contribution) [Newman, Girvan, 2004]
  • Greedy modularity optimization à la Louvain (computer science strikes back) [Blondel et al., 2008]
  • Using information compression [Rosvall, Bergstrom, 2008]
  • Inference for stochastic block models (e.g. with BP [Decelle et al., 2012]; with spectral approach [Lelarge, Massoulié, ... 2012, 2014])
Parenthesis: Stochastic Block Model
• Representation: as a matrix, as a network
• Conjectured phase diagram of identifiability
[Decelle, Krzakala, Moore, Zdeborova, 2011] [Lelarge, Massoulié, Xu, 2013]
Spectral analysis of networks
Spectral theory for networks
This is the study of graphs through the spectral analysis (eigenvalues, eigenvectors) of matrices related to the graph: the adjacency matrix, the Laplacian matrices, ...
Notations
$G = (V, E, w)$: a weighted graph
$N = |V|$: number of nodes
$A$: adjacency matrix, $A_{ij} = w_{ij}$
$d$: vector of strengths, $d_i = \sum_{j \in V} w_{ij}$
$D$: matrix of strengths, $D = \mathrm{diag}(d)$
$f$: signal (vector) defined on $V$
Definition of the Laplacian matrix of graphs
Laplacian matrix
$L$: Laplacian matrix, $L = D - A$
$(\lambda_i)$: eigenvalues of $L$, $0 = \lambda_0 < \lambda_1 \le \lambda_2 \le \dots \le \lambda_{N-1}$
$(\chi_i)$: eigenvectors of $L$, $L\chi_i = \lambda_i \chi_i$
Note: $\chi_0 = \mathbf{1}$ (the constant vector, up to normalization).
A simple example: the straight line
[Figure: a regular line graph] ←→
$$L = \begin{pmatrix} \ddots & \ddots & & & \\ -1 & 2 & -1 & & \\ & -1 & 2 & -1 & \\ & & -1 & 2 & -1 \\ & & & \ddots & \ddots \end{pmatrix}$$
For this regular line graph, $L$ is the 1-D classical Laplacian operator (i.e. the second-derivative operator, up to sign).
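The straight-line example can be checked numerically. The following is only a minimal sketch with NumPy, assuming an unweighted line of 6 nodes with free ends; the helper names `path_adjacency` and `laplacian` are illustrative and not from the talk.

```python
import numpy as np

def path_adjacency(n):
    """Adjacency matrix of an unweighted straight-line (path) graph on n nodes."""
    A = np.zeros((n, n))
    i = np.arange(n - 1)
    A[i, i + 1] = 1.0   # link node i to node i + 1
    A[i + 1, i] = 1.0   # undirected graph: A is symmetric
    return A

def laplacian(A):
    """Combinatorial Laplacian L = D - A, with D = diag(d) and d_i = sum_j A_ij."""
    return np.diag(A.sum(axis=1)) - A

n = 6  # illustrative size
L = laplacian(path_adjacency(n))
# eigh returns ascending eigenvalues and orthonormal eigenvectors (as columns)
lam, chi = np.linalg.eigh(L)

print(L[2])           # an interior row: -1, 2, -1 around the diagonal
print(lam.round(3))   # lam[0] is 0 up to rounding
print(chi[:, 0])      # constant vector, normalized to unit length
```

For this free-ended line the eigenvalues are $2 - 2\cos(\pi k / n)$, $k = 0, \dots, N-1$, which is the sense in which they play the role of squared frequencies.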
Spectral clustering vs. Modularity
• Spectral clustering: relaxation of the optimization of the minimal cut. The cut size between two groups, for an assignment $s_i = \pm 1$, is:
$$R = \frac{1}{4} \sum_{i,j} A_{ij} (1 - s_i s_j) = \frac{1}{4} s^\top L s$$
• By spectral decomposition of $L$, $L_{ij} = \sum_{k=1}^{N-1} \lambda_k (\chi_k)_i (\chi_k)_j$, the minimum of the relaxed problem is reached for $s = \chi_1$, which is then rounded to $s_i = \mathrm{sign}((\chi_1)_i)$.
• Problems with spectral clustering:
1) No assessment of the quality of the partitions
2) No comparison to a null hypothesis
• Modularity [Newman, 2003] (with $2m = \sum_i d_i$):
$$Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \frac{d_i d_j}{2m} \right] \delta(c_i, c_j)$$
• Null model: Bernoulli random graph with prob. $\frac{d_i d_j}{2m}$
• $Q$ is between $-1$ and $+1$ ($\le 1 - 1/n_c$ if $n_c$ groups)
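To make the two criteria concrete, here is a small sketch that evaluates the cut size $\frac{1}{4} s^\top L s$, the sign of $\chi_1$, and the modularity $Q$ on a toy graph. The graph (two triangles joined by one link) and all function names are assumptions chosen for illustration, not material from the talk.

```python
import numpy as np

def laplacian(A):
    """Combinatorial Laplacian L = D - A."""
    return np.diag(A.sum(axis=1)) - A

def cut_size(A, s):
    """Cut size R = (1/4) s^T L s for an assignment s_i = +1 or -1."""
    return 0.25 * s @ laplacian(A) @ s

def spectral_bipartition(A):
    """Relaxed minimal cut: sign of chi_1, the eigenvector of the second-smallest eigenvalue."""
    _, chi = np.linalg.eigh(laplacian(A))
    return np.where(chi[:, 1] >= 0, 1, -1)

def modularity(A, c):
    """Q = (1/2m) sum_ij [A_ij - d_i d_j / 2m] delta(c_i, c_j)."""
    d = A.sum(axis=1)
    two_m = d.sum()
    same = c[:, None] == c[None, :]          # delta(c_i, c_j)
    return ((A - np.outer(d, d) / two_m) * same).sum() / two_m

# Illustrative graph: triangles {0,1,2} and {3,4,5} joined by the link 2-3
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

s = spectral_bipartition(A)   # +1 / -1 labels, one sign per triangle
R = cut_size(A, s)            # one link is cut
Q = modularity(A, s)          # the labels s double as community labels
print(s, R, Q)
```

On this graph the sign of $\chi_1$ separates the two triangles, the cut size is 1 and $Q = 5/14$; the overall sign of `s` is arbitrary because eigenvectors are defined up to sign.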
• Comparison of optimization of cut and optimization of Q
• Modularity works well, better than spectral clustering
• More efficient algorithm: the greedy (ascending) Louvain approach (ok for millions of nodes!) [Blondel et al., 2008]
Existence of multiscale community structure in a graph
[Figure: the same graph partitioned into 16 communities (Q=0.80), 8 communities (Q=0.83), 4 communities (Q=0.74) and 2 communities (Q=0.50)]
• All representations are correct; modularity favours one
• Note: one could integrate an ad hoc scale into modularity [Arenas et al., 2008; Reichardt and Bornholdt, 2006]
Relating the Laplacian of graphs to Signal Processing
Laplacian matrix
$L$ or $\mathcal{L}$: Laplacian matrix, $L = D - A$ or $\mathcal{L} = I - D^{-1/2} A D^{-1/2}$
$(\lambda_i)$: eigenvalues of $L$, $0 = \lambda_0 < \lambda_1 \le \lambda_2 \le \dots \le \lambda_{N-1}$
$(\chi_i)$: eigenvectors of $L$, $L\chi_i = \lambda_i \chi_i$
A simple example: the straight line
[Figure: a regular line graph] ←→
$$L = \begin{pmatrix} \ddots & \ddots & & & \\ -1 & 2 & -1 & & \\ & -1 & 2 & -1 & \\ & & -1 & 2 & -1 \\ & & & \ddots & \ddots \end{pmatrix}$$
For this regular line graph, $L$ is the 1-D classical Laplacian operator (i.e. the second-derivative operator): its eigenvectors are the Fourier vectors, and its eigenvalues the associated (squared) frequencies
Objective and Fundamental analogy [Shuman, Vandergheynst et al., IEEE SP Mag, 2013]
Objective: Definition of a Fourier Transform adapted to graph signals
$f$: signal defined on $V$ ←→ $\hat f$: Fourier transform of $f$
Fundamental analogy
On any graph, the eigenvectors $\chi_i$ of the Laplacian matrix $L$ or $\mathcal{L}$ will be considered as the Fourier vectors, and its eigenvalues $\lambda_i$ the associated (squared) frequencies.
• Works exactly for all regular graphs (+ Beltrami-Laplace)
• Leads to natural generalizations of signal processing
The graph Fourier transform
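• $\hat f$ is obtained from the decomposition of $f$ on the eigenvectors $\chi_i$:
$$\hat f = \begin{pmatrix} \langle \chi_0, f \rangle \\ \langle \chi_1, f \rangle \\ \langle \chi_2, f \rangle \\ \vdots \\ \langle \chi_{N-1}, f \rangle \end{pmatrix}$$
Define $\chi = (\chi_0 | \chi_1 | \dots | \chi_{N-1})$: $\hat f = \chi^\top f$
• Reciprocally, the inverse Fourier transform reads: $f = \chi \hat f$
• Parseval theorem: $\forall (g, h)$, $\langle g, h \rangle = \langle \hat g, \hat h \rangle$
• Filtering: apply $g(\lambda_i)$ in the Fourier domain to the $\hat f(i)$.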
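The transform, its inverse, Parseval and filtering can be sketched in a few lines of NumPy. This is an illustration under assumptions: the ring graph, the random signals and the low-pass kernel $g(\lambda) = e^{-2\lambda}$ are arbitrary choices, and the function names are not from the talk.

```python
import numpy as np

def fourier_basis(A):
    """Eigenvalues lam_i and eigenvectors chi_i (columns) of L = D - A."""
    L = np.diag(A.sum(axis=1)) - A
    return np.linalg.eigh(L)

def gft(chi, f):
    """Graph Fourier transform: f_hat = chi^T f."""
    return chi.T @ f

def igft(chi, f_hat):
    """Inverse graph Fourier transform: f = chi f_hat."""
    return chi @ f_hat

def graph_filter(lam, chi, f, g):
    """Multiply each Fourier coefficient f_hat(i) by g(lam_i), then go back to the nodes."""
    return igft(chi, g(lam) * gft(chi, f))

# Illustrative graph: a ring of 8 nodes
n = 8
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
lam, chi = fourier_basis(A)

rng = np.random.default_rng(0)
f = rng.standard_normal(n)
h = rng.standard_normal(n)

f_back = igft(chi, gft(chi, f))                    # reconstruction of f
parseval_gap = f @ h - gft(chi, f) @ gft(chi, h)   # should vanish
f_smooth = graph_filter(lam, chi, f, lambda x: np.exp(-2.0 * x))  # assumed low-pass kernel
print(np.abs(f_back - f).max(), parseval_gap)
```

Because the assumed kernel satisfies $g(0) = 1$ and $\chi_0$ is constant, the filtered signal keeps the mean of $f$ while its oscillating components are damped.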
Fourier modes: examples in 1D and in graphs
[Figure: low-frequency and high-frequency Fourier modes, in 1D and on a graph]
• Alternative Fourier transform: use the adjacency matrix A [Sandryhaila, Moura, IEEE TSP, 2013]
Spectral analysis: the χi and λi of a multiscale toy graph
[Figure: left, the eigenvectors $\chi_i$ (nodes vs. mode #, values between −0.5 and 0.5); right, the eigenvalues $\lambda_i$ vs. mode #]
Spectral Graph Wavelets [Hammond et al., ACHA 2011]
• Fourier is a global analysis. Fourier modes (eigenvectors of the Laplacian) are used in classical spectral clustering, but do not enable a jointly local and scale-dependent analysis.
• For that, classical signal processing (or harmonic analysis) teaches us that we need wavelets.
• Wavelets: local functions that also act as a filter around a chosen scale.
[Figure: a wavelet; the same wavelet translated; the same wavelet scaled]
• Classical wavelets → (by analogy) → Graph wavelets
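Classical wavelets → (by analogy) → Graph wavelets

| | Classical (continuous) world | Graph world |
|---|---|---|
| Real domain | $x$ | node $a$ |
| Fourier domain | $\omega$ | eigenvalues $\lambda_i$ |
| Filter kernel | $\hat\psi(\omega)$ | $g(\lambda_i) \Leftrightarrow \hat G$ |
| Filter bank | $\hat\psi(s\omega)$ | $g(s\lambda_i) \Leftrightarrow \hat G_s$ |
| Fourier modes | $\exp(-i\omega x)$ | eigenvectors $\chi_i$ |
| Fourier transf. of $f$ | $\hat f(\omega) = \int_{-\infty}^{\infty} f(x) \exp(-i\omega x)\, dx$ | $\hat f = \chi^\top f$ |

The wavelet at scale $s$ centered around $a$ is given by:
$$\psi_{s,a}(x) = \frac{1}{s} \psi\left(\frac{x - a}{s}\right) = \int_{-\infty}^{\infty} \hat\delta_a(\omega)\, \hat\psi(s\omega) \exp(i\omega x)\, d\omega$$
In the graph world: $\psi_{s,a} = \chi \hat G_s \hat\delta_a = \chi \hat G_s \chi^\top \delta_a$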
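The graph-world formula translates directly into code. The sketch below builds all wavelets at one scale on a small line graph; the band-pass kernel $g(x) = x e^{-x}$ is only a stand-in chosen for brevity (it is not the kernel used in the talk), and the graph, scale and node are arbitrary.

```python
import numpy as np

def wavelet_matrix(lam, chi, s, g):
    """All wavelets at scale s: column a is psi_{s,a} = chi G_s chi^T delta_a,
    with G_s = diag(g(s * lam_i))."""
    return (chi * g(s * lam)) @ chi.T   # scales column i of chi by g(s * lam_i)

def g(x):
    """Assumed band-pass kernel with g(0) = 0; a placeholder, not the talk's kernel."""
    return x * np.exp(-x)

# Illustrative graph: a straight line of 10 nodes
n = 10
A = np.zeros((n, n))
i = np.arange(n - 1)
A[i, i + 1] = A[i + 1, i] = 1.0
lam, chi = np.linalg.eigh(np.diag(A.sum(axis=1)) - A)

s, a = 2.0, 4                       # illustrative scale and center node
Psi = wavelet_matrix(lam, chi, s, g)
psi_sa = Psi[:, a]                  # wavelet centered on node a

delta_a = np.zeros(n)
delta_a[a] = 1.0
psi_direct = chi @ (g(s * lam) * (chi.T @ delta_a))   # same wavelet, step by step
print(psi_sa.round(3))
```

Since the assumed kernel vanishes at $\lambda = 0$, each wavelet has zero mean, as expected from a band-pass filter.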
Examples of graph wavelets
[Figure: a graph wavelet; the same wavelet translated to another node; the same wavelet at another scale]
Examples of wavelets: they encode the local topology
[Figure: wavelets $\psi_{s,a}$ centered on the same node $a$, at scales $s = 1$, $25$, $35$ and $50$]
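Example of wavelet filters
• More precisely, we will use the following kernel:
$$g(x; \alpha, \beta, x_1, x_2) = \begin{cases} x_1^{-\alpha} x^{\alpha} & \text{for } x < x_1 \\ p(x) & \text{for } x_1 \le x \le x_2 \\ x_2^{\beta} x^{-\beta} & \text{for } x > x_2 \end{cases}$$
• To emphasize $\chi_1$, the parameters are:
$$s_{\min} = \frac{1}{\lambda_2}, \quad x_2 = \frac{1}{\lambda_2}, \quad s_{\max} = \frac{1}{\lambda_2^2}, \quad x_1 = 1, \quad \beta = 1 / \log_{10}\left(\frac{\lambda_3}{\lambda_2}\right)$$
• This leads to (choice $\alpha = 2$):
[Figure: left, $g(s\lambda)$ vs. $\lambda$ for $s = 7, 13, 25, 47$; right, $g(x)$ vs. $x$ for $\alpha = 1, 2, 10, 50$, with $x_1$ and $x_2$ marked]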
A new method for multiscale community detection [N. Tremblay, P. Borgnat, 2013]
General ideas
• Take advantage of local topological information encoded in graph wavelets. Wavelet = ego-centered vision from a node
• Group together nodes whose local environments are similar at the description scale
• This will naturally offer a multiscale vision of communities
The method is based on:
1. wavelets (resp. scaling functions) as feature vectors
2. the correlation distance to compare them
3. the complete linkage clustering algorithm
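1) Wavelets as features
Each node $a$ has feature vector $\psi_{s,a}$.
Globally, one will need $\Psi_s$, all wavelets at a given scale $s$, i.e.
$$\Psi_s = (\psi_{s,1} | \psi_{s,2} | \dots | \psi_{s,N}) = \chi \hat G_s \chi^\top.$$
[Figure: wavelets centered on node A and on node B, at small scale and at large scale]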
2) Correlation distances
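$$D_s(a, b) = 1 - \frac{\psi_{s,a}^\top \psi_{s,b}}{\|\psi_{s,a}\|_2 \, \|\psi_{s,b}\|_2}.$$
[Figure: wavelets centered on node A and on node B; correlation coefficients −0.50 and 0.97 for the two scales shown]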
3) Complete linkage clustering and dendrogram
• Bottom to top hierarchical algorithm: start with as many clusters as nodes and work the way up to fewer clusters (by linking subclusters together) until reaching one global cluster.
• Computation of the distance between two subclusters: the maximum distance between all pairs of nodes, taking one from each cluster
• Output: a dendrogram
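The three ingredients (wavelet features, correlation distance, complete linkage) can be chained as in the sketch below, using SciPy's hierarchical clustering. Everything specific here is an assumption for illustration: the toy graph (two triangles joined by one link), the stand-in kernel $g(x) = x e^{-x}$, the scale $s = 1/\lambda_1$, and the choice to ask for two clusters.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def wavelets(A, s, g):
    """Psi_s = chi G_s chi^T; column a is the feature vector of node a."""
    lam, chi = np.linalg.eigh(np.diag(A.sum(axis=1)) - A)
    return (chi * g(s * lam)) @ chi.T

def correlation_distance(Psi):
    """D[a, b] = 1 - <psi_a, psi_b> / (||psi_a|| ||psi_b||) between columns of Psi."""
    norms = np.linalg.norm(Psi, axis=0)
    D = 1.0 - (Psi.T @ Psi) / np.outer(norms, norms)
    D = np.clip(0.5 * (D + D.T), 0.0, 2.0)   # remove rounding asymmetry / negatives
    np.fill_diagonal(D, 0.0)
    return D

# Illustrative graph: triangles {0,1,2} and {3,4,5} joined by the link 2-3
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

lam = np.linalg.eigvalsh(np.diag(A.sum(axis=1)) - A)
s = 1.0 / lam[1]                          # assumed scale, tuned to the smallest nonzero eigenvalue
g = lambda x: x * np.exp(-x)              # assumed stand-in kernel, not the talk's

D = correlation_distance(wavelets(A, s, g))
Z = linkage(squareform(D, checks=False), method="complete")   # dendrogram as a linkage matrix
labels = fcluster(Z, t=2, criterion="maxclust")               # cut into (at most) two clusters
print(Z[:, 2].round(3))   # merge heights, in correlation distance
print(labels)
```

The third column of `Z` holds the heights at which subclusters are linked, which is what a dendrogram plot displays; where to cut is a separate decision.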
Example of a dendrogram at a given scale s
[Figure: dendrogram; vertical axis: correlation distance, from 0 to 2]
The big question: where should we cut the dendrogram?
Dendrogram cut at maximal gap
Simplest method: cut the dendrogram at its maximal gap.
[Figure: dendrograms at small scale and at large scale; vertical axis: correlation distance, from 0 to 2]
Note: we use the toy graph
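One possible reading of "cut at the maximal gap" is sketched here: sort the merge heights of the dendrogram, find the largest jump between consecutive heights, and cut in the middle of it. The hand-made distance matrix (three groups of two nodes) and the midpoint rule are assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def cut_at_maximal_gap(Z):
    """Cut a dendrogram (SciPy linkage matrix Z) in the middle of its largest height gap."""
    heights = np.sort(Z[:, 2])          # merge heights, ascending
    gaps = np.diff(heights)             # gap between consecutive merges
    k = int(np.argmax(gaps))
    threshold = 0.5 * (heights[k] + heights[k + 1])
    labels = fcluster(Z, t=threshold, criterion="distance")
    return labels, threshold

# Illustrative distances: 0.05 inside each of three groups, 1.0 between groups
groups = np.array([0, 0, 1, 1, 2, 2])
D = np.where(groups[:, None] == groups[None, :], 0.05, 1.0)
np.fill_diagonal(D, 0.0)

Z = linkage(squareform(D), method="complete")
labels, threshold = cut_at_maximal_gap(Z)
print(threshold, labels)
```

Here the merges happen at heights 0.05 (three times) and 1.0 (twice), so the largest gap lies between them and the cut recovers the three groups.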
[Figure: left, AR coefficient (0 to 1) vs. scale number (0 to 50); right, communities found, nodes vs. scale number]
• Improvement: cut at average maximal gap
The Sales-Pardo benchmark
• Three community structures nested in one another
• Parameters:
  • sizes of the communities (N = 640)
  • ρ tunes how well separated the different scales are
  • k is the average degree; the sparser the graph, the harder it is to recover the communities.
Results on the Sales-Pardo benchmark
[Figure: adjusted Rand index (0 to 1) vs. scale s (10 to 39.8) for the large, medium and small scales, with the three scale ranges labelled 3, 2, 1]
The case of larger networks
• Limit of the method: computation of the $N \times N$ matrix of the wavelets $\Psi_s$.
• Improvement: use of random features.
• Let $r \in \mathbb{R}^N$ be a random vector on the nodes of the graph, composed of $N$ independent normal random variables of zero mean and finite variance $\sigma^2$.
• Define the feature $f_{s,a} \in \mathbb{R}$ at scale $s$ associated to node $a$ as
$$f_{s,a} = \psi_{s,a}^\top r = \sum_{k=1}^{N} \psi_{s,a}(k)\, r(k).$$
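• Let us define the correlation between features
$$\mathrm{Cor}(f_{s,a}, f_{s,b}) = \frac{E\big((f_{s,a} - E(f_{s,a}))(f_{s,b} - E(f_{s,b}))\big)}{\sqrt{\mathrm{Var}(f_{s,a})\, \mathrm{Var}(f_{s,b})}}.$$
• It is easy to show (using $E(r r^\top) = \sigma^2 I$, so that $E(f_{s,a} f_{s,b}) = \sigma^2 \psi_{s,a}^\top \psi_{s,b}$) that:
$$\mathrm{Cor}(f_{s,a}, f_{s,b}) = \frac{\psi_{s,a}^\top \psi_{s,b}}{\|\psi_{s,a}\|_2 \, \|\psi_{s,b}\|_2}.$$
• Therefore, the sample correlation estimator $\hat C_{ab,\eta}$ satisfies:
$$\lim_{\eta \to +\infty} \hat C_{ab,\eta} = \frac{\psi_{s,a}^\top \psi_{s,b}}{\|\psi_{s,a}\|_2 \, \|\psi_{s,b}\|_2} = 1 - D_s(a, b).$$
• This leads to a faster algorithm.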
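The convergence of the sample correlation can be observed numerically. In this sketch the η random vectors are filtered by $\chi \hat G_s \chi^\top$ without forming $\Psi_s$, and $\Psi_s$ is then built only to compare against the exact value. The toy graph, the stand-in kernel $g(x) = x e^{-x}$, the seed and η = 20000 are all assumptions for illustration; the eigendecomposition is kept for brevity.

```python
import numpy as np

# Illustrative graph: triangles {0,1,2} and {3,4,5} joined by the link 2-3
N = 6
A = np.zeros((N, N))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
lam, chi = np.linalg.eigh(np.diag(A.sum(axis=1)) - A)

s = 1.0 / lam[1]                  # assumed scale
G = s * lam * np.exp(-s * lam)    # assumed kernel g(x) = x exp(-x), evaluated at s * lam_i

# Random features: row a of F holds eta samples of f_{s,a} = psi_{s,a}^T r
rng = np.random.default_rng(0)
eta = 20000
R = rng.standard_normal((N, eta))            # eta independent random vectors r (sigma = 1)
F = chi @ (G[:, None] * (chi.T @ R))         # filter each r; Psi_s is never formed here
estimate = np.corrcoef(F)                    # sample correlation between rows (nodes)

# Exact value, for comparison only
Psi = (chi * G) @ chi.T
norms = np.linalg.norm(Psi, axis=0)
exact = (Psi.T @ Psi) / np.outer(norms, norms)   # equals 1 - D_s(a, b)

err = np.abs(estimate - exact).max()
print(err)
```

The estimation error decreases roughly like $1/\sqrt{\eta}$, so η trades accuracy against computation time.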
Results on the Sales-Pardo benchmark
• As a function of η, the number of random vectors used
[Figure: left, mean recall ratios for the large-scale (LS), medium-scale (MS) and small-scale (SS) partitions vs. η (20 to 100); right, mean computation time in seconds (about 3.2 to 4.2) vs. η]
Stability of the communities
• Not all partitions are relevant: only those stable enough convey information about the network
• Lambiotte's approach to stability: create $B$ resampled graphs by randomly adding $\pm p\%$ (typically $p = 10$) to the weight of each link, and compute the corresponding $B$ sets of partitions $\{P_s^b\}_{b \in [1,B],\, s \in S}$. Then, stability:
$$\gamma_r(s) = \frac{2}{B(B-1)} \sum_{(b,c) \in [1,B]^2,\, b \neq c} \mathrm{ari}(P_s^b, P_s^c) \qquad (1)$$
• New approach: we have a stochastic algorithm. Consider $J$ sets of $\eta$ random signals and compute the associated sets of partitions $\{P_s^j\}_{j \in [1,J],\, s \in S}$. Let stability be:
$$\gamma_a(s) = \frac{2}{J(J-1)} \sum_{(i,j) \in [1,J]^2,\, i \neq j} \mathrm{ari}(P_s^i, P_s^j) \qquad (2)$$
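The stability score is an average of adjusted Rand indices over pairs of runs. The sketch below assumes that the sum in (2) runs over unordered pairs, so that identical partitions give a stability of 1; the `ari` helper uses the usual contingency-table formula, and the partitions are made up for the example.

```python
from itertools import combinations
from math import comb

import numpy as np

def ari(p, q):
    """Adjusted Rand index between two partitions given as label sequences."""
    _, pi = np.unique(np.asarray(p), return_inverse=True)
    _, qi = np.unique(np.asarray(q), return_inverse=True)
    table = np.zeros((pi.max() + 1, qi.max() + 1), dtype=int)
    np.add.at(table, (pi, qi), 1)                      # contingency table
    together_both = sum(comb(int(n), 2) for n in table.ravel())
    together_p = sum(comb(int(n), 2) for n in table.sum(axis=1))
    together_q = sum(comb(int(n), 2) for n in table.sum(axis=0))
    expected = together_p * together_q / comb(len(pi), 2)
    maximum = 0.5 * (together_p + together_q)
    if maximum == expected:                            # degenerate case: trivial partitions
        return 1.0
    return (together_both - expected) / (maximum - expected)

def stability(partitions):
    """Mean ARI over all unordered pairs of partitions obtained at one scale s."""
    pairs = list(combinations(partitions, 2))
    return sum(ari(p, q) for p, q in pairs) / len(pairs)

# Hypothetical outcomes of J = 3 runs of the stochastic algorithm at one scale
stable_runs = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 1, 1]]     # same partition, relabelled
unstable_runs = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]   # one run disagrees

gamma_stable = stability(stable_runs)
gamma_unstable = stability(unstable_runs)
print(gamma_stable, gamma_unstable)
```

The instability plotted on the slides would then be one minus this value.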
Results with stabilities on the Sales-Pardo benchmark
[Figure: top, adjusted Rand index (0 to 1) vs. scale s (10 to 39.8) for the large, medium and small scales; bottom, instabilities $1 - \gamma_r$ and $1 - \gamma_a$ vs. scale s, with the three scale ranges labelled 1, 2, 3]
In addition: statistical test of relevance of the communities
• It is possible to design a data-driven test on $\gamma_a$ (computation of a numerical threshold for the configuration (or Chung-Lu) model).
• Result: threshold for $1 - \gamma_a$ above which the partition in communities is irrelevant.
[Figure: instability $1 - \gamma_a$ vs. scale s, for the Sales-Pardo graph (s from 10 to 39.8, scale ranges labelled 3, 2, 1) and for a Chung-Lu graph (s from 1.9 to 2.5)]
Comparison on larger Sales-Pardo graphs
N = 6400 nodes
[Figure: Schaub-Delvenne method: adjusted Rand index (LS, MS, SS) and variation of information vs. Markov time; wavelet method: adjusted Rand index (LS, MS, SS) and instability $1 - \gamma_a$ vs. scale s (6.3 to 25.1)]
Sensor network on the swiss roll manifold
• Three scale ranges of relevant community structure
[Figure: instability $1 - \gamma_a$ (0 to 0.4) vs. scale s (from $10^4$ to $10^7$)]
The dynamic social network of a primary school
Collaboration with A. Barrat (CPT Marseille), C. Cattuto (ISI, Turin)
Sociopatterns project
• Acquisition of face-to-face human contacts (resolved in time) using active RFID tags and fixed antennas
• Interest: social studies, spreading processes (of information, of epidemics, ...), contact dynamics, ...
• Time for a movie!
Multi-scale Communities in Primary School
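[Figure: top, communities found at scales s = 20, 28, 37, 51, 74, 103, compared with the school classes (1st a, 1st b, 2nd a, 2nd b, 3rd a, 3rd b, 4th a, 4th b, 5th a, 5th b); bottom, instability $1 - \gamma_a$ (0 to 1) vs. scale s]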
Conclusion
• Wavelet $\psi_{s,a}$ gives an "egocentered view" of the network seen from node a at scale s
• Correlation between these different views gives us a distance between nodes at scale s
• This enables multi-scale clustering of nodes in communities
• Associated to a notion of stability and of statistical detection of relevance
• I hope also that you were interested in the emerging field of graph signal processing for networks.
http://perso.ens-lyon.fr/pierre.borgnat
Acknowledgements: thanks to Nicolas Tremblay, from whom many of the figures and slides are borrowed.
A toy graph for introducing the method
[Figure: the toy graph partitioned at the smallest scale (16 com.), small scale (8 com.), medium scale (4 com.) and large scale (2 com.)]
Dendrogram cut with prior knowledge
Let us cheat by using prior knowledge on the number of communities we are looking for.
[Figures: AR coefficient (0 to 1) vs. scale number (0 to 50), using wavelets as features, when each dendrogram is cut in two, four, eight and sixteen clusters; a last panel shows the four levels of communities together]
Conclusion: the dendrograms at different scales contain the community structure at various scales.
Dendrogram cut with modularity
• By maximizing the classical modularity Q
• or by maximizing a filtered modularity [Arenas, Delvenne, ...]
[Figure: AR coefficient (0 to 1) vs. scale number (0 to 50), for classical modularity optimization and for filtered modularity optimization]
• The solutions are not satisfactory at all scales.
Dendrogram cut at maximal gap: not robust to outliers
[Figure: dendrogram with outliers; correlation distance (0 to 1.5) vs. nodes]
Dendrogram cut at maximal average gap
$$\Gamma = \frac{1}{N \max(\text{corr. dist.})} \sum_{a \in V} \Gamma_a$$
[Figure, at small scale: $\Gamma_a$ and $\Gamma$ vs. correlation distance (0 to 1), next to the dendrogram (correlation distance vs. nodes)]
[Figure, at larger scale: $\Gamma_a$ and $\Gamma$ vs. correlation distance (0 to 1.5), next to the dendrogram (correlation distance vs. nodes)]
For the previous graph:
[Figure: $\Gamma$ (0 to 1) vs. correlation distance (0 to 1)]
Recall: The Adjusted Rand Index
Let:
• $C$ and $C'$ be two partitions we want to compare.
• $a$ be the # of pairs of nodes that are in the same community in $C$ and in the same community in $C'$
• $b$ be the # of pairs of nodes that are in different communities in $C$ and in different communities in $C'$
• $c$ be the # of pairs of nodes that are in the same community in $C$ and in different communities in $C'$
• $d$ be the # of pairs of nodes that are in different communities in $C$ and in the same community in $C'$
$a + b$ is the number of "agreements" between $C$ and $C'$.
$c + d$ is the number of "disagreements" between $C$ and $C'$.
The Adjusted Rand Index
The Rand index, $R$, is:
$$R = \frac{a + b}{a + b + c + d} = \frac{a + b}{\binom{n}{2}}$$
The Adjusted Rand index $AR$ is the corrected-for-chance version of the Rand index:
$$AR = \frac{R - \text{ExpectedIndex}}{\text{MaxIndex} - \text{ExpectedIndex}}$$
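The pair counts $a, b, c, d$ and both indices can be computed by brute force on small partitions. The slide does not spell out ExpectedIndex and MaxIndex; the sketch below assumes the usual Hubert-Arabie permutation model, applied to the count $a$ of pairs grouped together in both partitions (for fixed community sizes, $R$ is an affine function of $a$, so the corrected value is the same). Function names and the example partitions are illustrative.

```python
from itertools import combinations

def pair_counts(C, Cp):
    """Counts (a, b, c, d) of node pairs, following the definitions of the slide."""
    a = b = c = d = 0
    for i, j in combinations(range(len(C)), 2):
        same_C = C[i] == C[j]
        same_Cp = Cp[i] == Cp[j]
        if same_C and same_Cp:
            a += 1          # together in C and in C'
        elif not same_C and not same_Cp:
            b += 1          # separated in C and in C'
        elif same_C:
            c += 1          # together in C, separated in C'
        else:
            d += 1          # separated in C, together in C'
    return a, b, c, d

def rand_index(C, Cp):
    """R = (a + b) / (a + b + c + d): fraction of pairs on which C and C' agree."""
    a, b, c, d = pair_counts(C, Cp)
    return (a + b) / (a + b + c + d)

def adjusted_rand_index(C, Cp):
    """AR = (Index - Expected) / (Max - Expected), assuming the permutation model."""
    a, b, c, d = pair_counts(C, Cp)
    n_pairs = a + b + c + d
    expected = (a + c) * (a + d) / n_pairs   # expected number of pairs together in both
    maximum = 0.5 * ((a + c) + (a + d))
    if maximum == expected:                  # degenerate case: trivial partitions
        return 1.0
    return (a - expected) / (maximum - expected)

C = [0, 0, 1, 1]
C_crossed = [0, 1, 0, 1]                     # every community of C is split
print(pair_counts(C, C_crossed))             # (0, 2, 2, 2)
print(rand_index(C, C_crossed))              # 2 agreements out of 6 pairs
print(adjusted_rand_index(C, C_crossed))     # -0.5: worse than chance
```

The example shows why the correction matters: the two partitions still agree on a third of the pairs ($R = 1/3$), yet the adjusted index is negative.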