IJCAI13 Paper review: Large-scale spectral clustering on graphs

Paper digest “Large-Scale Spectral Clustering on Graphs” Akisato Kimura [email protected], @_akisato


Large-scale spectral clustering by introducing super-nodes, which might be inspired by co-clustering.


Page 2

One-page abstract

• Approx. acceleration of spectral clustering

– by introducing additional nodes that enable us to compress the original graph,

– resulting in a bipartite graph which is computationally efficient for spectral clustering.

• Note

– Targets large-scale spectral clustering; works especially well for dense graphs.

– Not suitable for large-scale graph clustering, because such graphs are sparse by nature.

Page 3

Spectral clustering [Shi & Malik 1997]

• Notations

– Undirected weighted graph $G = (V, E)$

– Num. nodes $n = |V|$; num. edges $m = |E|$

– Adjacency matrix $W = (W_{i,j})_{i,j=1,2,\dots,n}$

• Objective function

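– Solved by eigen-decomposition (EVD)

$$\min_{X \in \mathbb{R}^{n \times k}} \mathrm{Tr}\left(X^T D^{-1/2} L D^{-1/2} X\right) \quad \text{s.t. } X^T X = I$$

($L = D - W$: graph Laplacian of $W$, $D$: diagonal degree matrix, $k$: num. clusters)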
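As a rough illustration of the objective above, the following sketch computes the spectral embedding with a dense EVD in NumPy. The function name `spectral_embedding` is made up for this example, the graph is assumed to have no isolated nodes, and the final k-means step on the rows of the embedding is left out.

```python
import numpy as np

def spectral_embedding(W, k):
    """Return the k smallest eigenpairs of D^{-1/2} L D^{-1/2} (dense EVD, O(n^3))."""
    deg = W.sum(axis=1)                      # node degrees (diagonal of D)
    L = np.diag(deg) - W                     # unnormalized Laplacian L = D - W
    d_inv_sqrt = 1.0 / np.sqrt(deg)          # assumes every degree is positive
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L_sym)       # eigenvalues in ascending order
    return vals[:k], vecs[:, :k]             # columns are orthonormal: X^T X = I
```

The columns returned by `np.linalg.eigh` are orthonormal, so the constraint on $X$ holds by construction; clustering the rows of the returned matrix gives the final partition.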

Page 4

Main contribution of this work

• SC needs $O(n^3)$ computations due to EVD.

• Several improvements so far.

– Compressing the adjacency matrix by the Nyström method [Fowlkes+ 2004]

– Reducing samples (= nodes) [Shinnou & Sasaki 2008] [Yan+ 2009] [Sakai & Imiya 2009] [Chen & Cai 2011]

– Early stopping of EVD [Chen+ 2006] [Liu+ 2007]

• In contrast, this work

– reduces the size of the graph.

Page 5

Introducing supernodes

• Why supernodes? Intuition from co-clustering

– A partition of the supernodes can induce a partition of the observed nodes, and vice versa.

• Generating a set of 𝑑 ≪ 𝑛 supernodes

[Figure: the original graph of regular nodes, with supernodes added]

Page 6

How to generate supernodes

1. Randomly choosing 𝑑 regular nodes as seeds.

2. Calculating the shortest paths from the seeds to the other regular nodes.

i. Converting adjacencies to distances.

ii. Applying Dijkstra’s algorithm.

3. Partitioning all the regular nodes into 𝑑 disjoint subsets based on the shortest paths.

4. (Each subset corresponds to a supernode.)
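The four steps above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the slide does not say how adjacencies are converted to distances, so the reciprocal `1 / similarity` is used here as a placeholder, and the names `dijkstra` and `generate_supernodes` are invented for the example.

```python
import heapq
import random

def dijkstra(adj, source):
    """Shortest-path distances from source; adj maps node -> {neighbor: distance}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d_u, u = heapq.heappop(heap)
        if d_u > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, length in adj[u].items():
            d_v = d_u + length
            if d_v < dist.get(v, float("inf")):
                dist[v] = d_v
                heapq.heappush(heap, (d_v, v))
    return dist

def generate_supernodes(weights, d, seed=0):
    """weights maps node -> {neighbor: similarity > 0}. Returns (seeds, assignment)."""
    # Step 2-i: adjacency -> distance (ASSUMPTION: distance = 1 / similarity).
    adj = {u: {v: 1.0 / w for v, w in nbrs.items()} for u, nbrs in weights.items()}
    # Step 1: pick d regular nodes at random as seeds.
    seeds = random.Random(seed).sample(sorted(weights), d)
    # Step 2-ii: one Dijkstra run per seed.
    dists = [dijkstra(adj, s) for s in seeds]
    # Steps 3-4: each node joins the supernode of its nearest seed.
    inf = float("inf")
    assignment = {u: min(range(d), key=lambda j: dists[j].get(u, inf)) for u in weights}
    return seeds, assignment
```

Running one Dijkstra per seed mirrors the per-seed cost quoted later in the deck; a single multi-source run would produce the same partition up to ties.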

Page 7

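After generating supernodes

$$\hat{W} = R W$$

– $W \in \mathbb{R}^{n \times n}$: adjacency matrix over the 𝑛 regular nodes

– $R \in \mathbb{Z}^{d \times n}$: binary bipartite graph between the 𝑑 supernodes and the 𝑛 regular nodes

– $\hat{W} \in \mathbb{R}^{d \times n}$: bipartite, called a “reduced graph”, obtained by propagating edge weights between regular nodes and supernodes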
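A small sketch of how the reduced graph might be formed as the product of the binary matrix 𝑅 and the adjacency matrix. The function name `reduced_graph` and the integer `assignment` array (supernode index per regular node) are assumptions made for this example.

```python
import numpy as np

def reduced_graph(W, assignment, d):
    """Build the binary d x n matrix R and the d x n reduced graph R @ W."""
    n = W.shape[0]
    R = np.zeros((d, n), dtype=int)
    R[assignment, np.arange(n)] = 1   # R[j, i] = 1 iff node i belongs to supernode j
    W_hat = R @ W                     # row j sums the adjacency rows of supernode j's members
    return R, W_hat
```

Each entry of the product is the total edge weight between one supernode's members and one regular node, which is what "propagating edge weights" amounts to in this reading.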

Page 8

Spectral clustering on reduced graphs

• Consider another representation of the reduced graph

• Spectral clustering on 𝑊′

[Figure: the reduced graph between 𝑛 regular nodes and 𝑑 supernodes, and the result of spectral clustering on 𝑊′]

Page 9

Spectral clustering on reduced graphs

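• Spectral clustering on 𝑊′ becomes a co-clustering structure

$$Z x = (1 - \lambda)\, y, \qquad Z^T y = (1 - \lambda)\, x$$

– 𝑥 and 𝑦 are the right and left singular vectors of $Z \in \mathbb{R}^{d \times n}$.

• It can be simplified further

– 𝑦 is also an eigenvector of $Z Z^T \in \mathbb{R}^{d \times d}$

$$\because\ Z Z^T y = Z (1 - \lambda) x = (1 - \lambda)^2 y$$

($Z Z^T$ looks like a compressed representation of 𝑊.)

[Figure: bipartite graph between 𝑛 regular nodes and 𝑑 supernodes]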
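The simplification can be checked numerically with a sketch like the one below. The slides do not show how 𝑍 is built, so the degree normalization used here (dividing each entry of the reduced graph by the square roots of its row and column sums) is an assumption, as are the function name `embed_via_small_evd` and its return layout. It also assumes the top 𝑘 singular values are nonzero.

```python
import numpy as np

def embed_via_small_evd(W_hat, k):
    """Recover n-dimensional vectors x from a d x d eigenproblem on Z Z^T."""
    d1 = W_hat.sum(axis=0)                     # degrees of the n regular nodes
    d2 = W_hat.sum(axis=1)                     # degrees of the d supernodes
    Z = W_hat / np.sqrt(np.outer(d2, d1))      # ASSUMED normalization, shape (d, n)
    vals, Y = np.linalg.eigh(Z @ Z.T)          # small d x d EVD, ascending order
    order = np.argsort(vals)[::-1][:k]         # keep the k largest eigenvalues
    sigma = np.sqrt(np.clip(vals[order], 0.0, None))  # sigma plays the role of 1 - lambda
    Y = Y[:, order]                            # vectors y, one per column
    X = Z.T @ Y / sigma                        # x = Z^T y / (1 - lambda)
    return Z, sigma, X, Y
```

Only a $d \times d$ matrix is decomposed; the vectors over the 𝑛 regular nodes come from one matrix product afterwards.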

Page 10

In summary

Described so far

Additional steps

Page 11

Regenerating supernodes

• Intuitions

1. The matrix 𝑈 ∈ ℝ𝑛×𝑘 implies the current clustering.

2. Most of the nodes in the same cluster are expected to be densely connected.

• Method

– Selecting 𝑘 − 1 right (= with large eigenvalues) vectors as supernodes.

[Figure: 𝑛 regular nodes, 𝑑 supernodes, and 𝑘 cluster nodes, labeled with 𝑈 and 𝑊]

Page 12

In detail

New regular-super links

Average affiliation score over all the samples.

• Resulting in (𝑘 − 1) edges from every regular node.

• Every edge stands for a binarized affiliation score.

• So, this idea can be easily extended to quantized affiliation scores with arbitrary sizes.
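One possible reading of the binarization described above, written as a sketch: each of the 𝑘 − 1 selected columns of the embedding is thresholded at its average over all samples, and every regular node is linked to either the "above average" or the "below average" supernode of that column. The slide's actual formula is not recoverable from the transcript, so the thresholding rule, the function name `regenerate_supernodes`, and the row layout of the returned matrix are all assumptions.

```python
import numpy as np

def regenerate_supernodes(U):
    """U: (n, k-1) selected embedding columns. Returns binary R of shape (2*(k-1), n)."""
    n, cols = U.shape
    above = U > U.mean(axis=0)          # True where a score exceeds its column average
    R = np.zeros((2 * cols, n), dtype=int)
    R[0::2] = above.T                   # supernode 2j: nodes above average in column j
    R[1::2] = (~above).T                # supernode 2j+1: the remaining nodes
    return R
```

Under this reading every regular node has exactly one link per column, which matches the (𝑘 − 1) edges per node stated above; replacing the two-way split by more quantization levels would give the extension mentioned in the last bullet.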

Page 13: IJCAI13 Paper review: Large-scale spectral clustering on graphs

Finally, the algorithm is as follows

Generating or updating supernodes

Small-size spectral clustering

can be replaced by a function of $t$, written $l_t$

Page 14: IJCAI13 Paper review: Large-scale spectral clustering on graphs

Computational costs

Alg. 1

– Steps 1-2: $O(nd \log n)$

– Steps 3-4: $O(md)$

– Step 5: $O(nd)$

– Step 6: $O(nd^2) + O(d^3)$

– Steps 7-9: $O(ndk)$

– Total: $O(nd \log n + md + nd^2)$

3. $O(nd \log n + m(d + 1))$

5. $O(mk)$

Alg. 2: $O(mk)$

Alg. 3: $O(nd \log n + md + mkt + n(d^2 + k^2 t))$

If $d^2 \approx k^2 t \approx \log^2 n$ → $O(n \log^2 n)$ (= modularity-based clustering)
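The last simplification can be checked term by term against the Alg. 3 bound. The sketch below takes $d \approx \log n$ and $k^2 t \approx \log^2 n$ from the slide, and additionally assumes $m = O(n)$, which the slide does not state:

```latex
\begin{aligned}
nd \log n &\approx n \log^2 n, \\
md &\approx m \log n, \\
mkt &\le m k^2 t \approx m \log^2 n \quad (\text{since } k \ge 1), \\
n d^2 &\approx n \log^2 n, \\
n k^2 t &\approx n \log^2 n, \\
\text{total} &= O\big((n + m) \log^2 n\big) = O(n \log^2 n) \quad \text{when } m = O(n).
\end{aligned}
```

Without the $m = O(n)$ assumption, the edge-dependent terms $md$ and $mkt$ dominate and the bound reads $O((n + m) \log^2 n)$.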

Page 15: IJCAI13 Paper review: Large-scale spectral clustering on graphs

Data sets for experiments

• 2 synthetic, 2 real-world.

– Syn-1k: kNN graph; 100k: 100-ins & 40-outs

– DBLP: Author network, co-conference links.

– IMDB: Movie network, co-director links.

• Looks like moderate-scale (not large-scale) graphs…

Page 16: IJCAI13 Paper review: Large-scale spectral clustering on graphs

Experimental results

[Table: results for Shortest Path (see Slide 6), Proposed (Alg. 1), Proposed (Alg. 3), Spectral Clustering, [Khoa & Chawla 2012], and [Fowlkes+ 2004]]

The proposed method is suitable for dense graphs. (If sparse, modularity-based clustering would be better: $O(n \log n) \sim O(n \log^2 n)$.)

Page 17: IJCAI13 Paper review: Large-scale spectral clustering on graphs

Detailed results

[Figure: performance of the proposed methods w.r.t. parameter 𝑑 (num. supernodes)] Why not monotonically increasing?

[Figure: performance of the proposed methods w.r.t. parameter 𝑡 (num. iterations)]

Page 18: IJCAI13 Paper review: Large-scale spectral clustering on graphs

Qualitative evaluations

• Toy example on Syn-1K

[Figure panels: Ground truth, k-NN graph, SP, Proposed 1, Proposed 2 (5 iterations), SC, RESC, Nyström]

Page 19: IJCAI13 Paper review: Large-scale spectral clustering on graphs

Comments

• The idea and technique are interesting and may be versatile.

• (Serial and parallel) implementations would be quite simple.

– Matlab code is available at http://jialu.cs.illinois.edu/publication

• Might be suitable only for dense graph clustering (with features).