Spectral clustering between friends. spectral clustering (a la Ng-Jordan-Weiss) datasimilarity graph...

Page 1:

spectral clustering between friends

Page 2:

One of these things is not like the other…

Page 3:

Page 4:

spectral clustering (a la Ng-Jordan-Weiss)

data

similarity graph

edges have weights w(i, j)

e.g.
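The example formula on this slide did not survive in the transcript. One common choice in spectral clustering, used here purely as an assumption, is a Gaussian kernel on pairwise distances. The function name `gaussian_similarity` and the bandwidth parameter `sigma` are illustrative, not taken from the slides.

```python
import numpy as np

def gaussian_similarity(points, sigma=1.0):
    """Weights w(i, j) = exp(-||x_i - x_j||^2 / (2 sigma^2)), with w(i, i) = 0.

    The Gaussian kernel is an assumed choice of similarity, not the slide's.
    """
    x = np.asarray(points, dtype=float)
    # Pairwise squared Euclidean distances via broadcasting.
    diff = x[:, None, :] - x[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    w = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)  # no self-loops in the similarity graph
    return w

W = gaussian_similarity([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0]], sigma=1.0)
```

Nearby points get weights close to 1 and distant points get weights close to 0, so `sigma` controls how local the graph is.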

Page 5:

the Laplacian

diagonal matrix D

Normalized Laplacian:
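The formulas on this slide are missing from the transcript. For reference, the standard definitions are given below, on the assumption that the slide follows the usual conventions, with $W$ the symmetric matrix of weights $w(i,j)$:

```latex
D_{ii} = \sum_{j} w(i,j),
\qquad
L = D - W,
\qquad
\mathcal{L} = I - D^{-1/2}\, W\, D^{-1/2}
```

Here $D$ is the diagonal degree matrix, $L$ the (unnormalized) Laplacian, and $\mathcal{L}$ the normalized Laplacian; $\mathcal{L}$ is defined when every vertex has positive degree.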

Page 6:

energy

Normalized Laplacian:
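The energy expression itself is not in the transcript. A standard identity that this slide plausibly refers to (an assumption) writes the Laplacian quadratic form as a sum over edges. With $W$ the weight matrix, $D$ the diagonal matrix of degrees $d_i = \sum_j w(i,j)$, and $\mathcal{L} = I - D^{-1/2} W D^{-1/2}$:

```latex
f^{\top} (D - W)\, f = \sum_{\{i,j\}} w(i,j)\,(f_i - f_j)^2,
\qquad
\frac{g^{\top} \mathcal{L}\, g}{g^{\top} g}
  = \frac{\sum_{\{i,j\}} w(i,j)\,(f_i - f_j)^2}{\sum_{i} d_i f_i^2}
\quad \text{where } g = D^{1/2} f .
```

A function $f$ has low energy when it changes little across heavy edges, which is why the low eigenvectors of $\mathcal{L}$ are nearly constant on well-separated clusters.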

Page 7:

spectral embedding

Normalized Laplacian:

Compute first k eigenvectors: $v_1, v_2, \ldots, v_k$
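A minimal sketch of this step, assuming the normalized Laplacian is $I - D^{-1/2} W D^{-1/2}$ and that "first" means the k smallest eigenvalues. The function name `spectral_embedding` is illustrative. Each vertex i is mapped to row i of the returned matrix.

```python
import numpy as np

def spectral_embedding(w, k):
    """Return the k smallest eigenvalues of the normalized Laplacian and
    an n-by-k matrix whose columns are the corresponding eigenvectors."""
    w = np.asarray(w, dtype=float)
    deg = w.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(deg)  # assumes every vertex has positive degree
    lap = np.eye(len(w)) - inv_sqrt[:, None] * w * inv_sqrt[None, :]
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(lap)
    return eigvals[:k], eigvecs[:, :k]

# Two disjoint triangles: vertices 0-2 and 3-5.
triangle = np.ones((3, 3)) - np.eye(3)
W = np.block([[triangle, np.zeros((3, 3))], [np.zeros((3, 3)), triangle]])
vals, emb = spectral_embedding(W, k=2)
```

On this toy graph both returned eigenvalues are zero up to rounding, and vertices in the same triangle are mapped to the same point.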

Page 8:

clustering

Run k–means to cluster the points
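A small sketch of the clustering step using plain Lloyd iterations. The deterministic farthest-first initialization is an assumption made to keep the example reproducible. The helper `normalize_rows` reflects the row rescaling used in the Ng-Jordan-Weiss paper. Both function names are illustrative.

```python
import numpy as np

def normalize_rows(embedding):
    """Rescale each embedded point to unit length (zero rows are left as is)."""
    e = np.asarray(embedding, dtype=float)
    norms = np.linalg.norm(e, axis=1, keepdims=True)
    return e / np.where(norms == 0, 1.0, norms)

def kmeans(points, k, iters=50):
    """Lloyd's algorithm with farthest-first initialization; returns labels."""
    x = np.asarray(points, dtype=float)
    centers = [x[0]]
    for _ in range(1, k):
        # Next center: the point farthest from all centers chosen so far.
        d = np.min([np.sum((x - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = np.sum((x[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
        labels = np.argmin(dist, axis=1)
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

labels = kmeans(normalize_rows([[1.0, 0.1], [2.0, 0.0], [0.1, 1.0], [0.0, 3.0]]), k=2)
```

In practice a library implementation with randomized restarts would usually replace this hand-written loop.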

Page 9:

spectral clustering

Sidi et al. 2011 [TelAviv-SFU]

Many, many variants…

it’s amazing!

it’s mediocre!

it’s antiquated

Many opinions

… what to prove?

Page 10:

why should spectral clustering work?

spectral embedding

k perfect clusters

Page 11:

graph expansion

Expansion: For a subset $S \subseteq V$, define

E(S) = set of edges with one endpoint in S.

(Figure: a subset S of the vertices.)
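The formula defining the expansion of S is missing from the transcript. The sketch below assumes the common normalization, total weight of E(S) divided by the volume of S (the sum of weighted degrees in S). The edge-dictionary representation and the function names are illustrative choices.

```python
def cut_edges(weights, subset):
    """E(S): edges with exactly one endpoint in S, with their weights."""
    s = set(subset)
    return {e: w for e, w in weights.items() if (e[0] in s) != (e[1] in s)}

def volume(weights, subset):
    """Sum of weighted degrees of the vertices in S."""
    s = set(subset)
    return sum(w * ((u in s) + (v in s)) for (u, v), w in weights.items())

def expansion(weights, subset):
    """Assumed definition: weight of E(S) divided by the volume of S."""
    return sum(cut_edges(weights, subset).values()) / volume(weights, subset)

# Two triangles {0, 1, 2} and {3, 4, 5} joined by the single edge (2, 3).
edges = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0,
         (3, 4): 1.0, (4, 5): 1.0, (3, 5): 1.0,
         (2, 3): 1.0}
phi = expansion(edges, {0, 1, 2})  # one cut edge, volume 2 + 2 + 3 = 7
```

A set with small expansion is a good cluster candidate: few edges leave it relative to how many edges touch it.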

Page 12:

graph expansion

Expansion: For a subset $S \subseteq V$, define

E(S) = set of edges with one endpoint in S.

Theorem [Cheeger70, Alon-Milman85, Sinclair-Jerrum89]:

$$\frac{\lambda_2}{2} \le \rho_G(2) \le \sqrt{2\lambda_2}$$

k-way expansion constant:

$$\rho_G(k) = \min\{\max_i \phi(S_i) : S_1, S_2, \ldots, S_k \subseteq V \text{ disjoint}\}$$

(Figure: disjoint subsets $S_1$, $S_2$, $S_3$, $S_4$ of the vertex set.)

“most important result in spectral graph theory” -- Wikipedia
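As a sanity check, the two-way inequality can be verified numerically on a tiny graph. This sketch assumes the expansion of S is cut weight divided by volume and that the eigenvalues are those of the normalized Laplacian $I - D^{-1/2} W D^{-1/2}$. It brute-forces the 2-way expansion constant over all partitions, which suffices for k = 2 because replacing the larger set by the complement of the smaller one never increases the maximum. All function names are illustrative.

```python
import itertools
import numpy as np

def lambda_2(w):
    """Second-smallest eigenvalue of the normalized Laplacian."""
    s = 1.0 / np.sqrt(w.sum(axis=1))  # assumes positive degrees
    lap = np.eye(len(w)) - s[:, None] * w * s[None, :]
    return np.linalg.eigvalsh(lap)[1]

def phi(w, subset):
    """Cut weight leaving `subset` divided by its volume."""
    inside = np.zeros(len(w), dtype=bool)
    inside[list(subset)] = True
    return w[inside][:, ~inside].sum() / w[inside].sum()

def rho_2(w):
    """Brute-force 2-way expansion constant (exponential time, tiny graphs only)."""
    n = len(w)
    best = np.inf
    for r in range(1, n):
        for subset in itertools.combinations(range(n), r):
            rest = [v for v in range(n) if v not in subset]
            best = min(best, max(phi(w, subset), phi(w, rest)))
    return best

# Two triangles joined by one edge.
W = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    W[u, v] = W[v, u] = 1.0

lam, rho = lambda_2(W), rho_2(W)
print(lam / 2 <= rho <= np.sqrt(2 * lam))
```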

Page 13:

Miclo’s conjecture

Higher-order Cheeger Conjecture [Miclo 08]:

For every graph G and $k \in \mathbb{N}$, we have

$$\frac{\lambda_k}{2} \le \rho_G(k) \le C(k)\sqrt{\lambda_k}$$

for some C(k) depending only on k.

[Lee-OveisGharan-Trevisan 2012]:

True with

This bound for C(k) is tight.

Algorithm of Ng-Jordan-Weiss works, changing the last step.

(Figure: disjoint subsets $S_1$, $S_2$, $S_3$, $S_4$ of the vertex set.)

Page 14:

the clustering step

Run k–means to cluster the points

we do random projection

random space partition
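The transcript gives no detail on how the projection and partition are carried out. The sketch below is therefore only a generic illustration of the two ideas, not the procedure of Lee, Oveis Gharan and Trevisan: a Gaussian random projection to a lower dimension, followed by a randomly shifted grid that groups points by cell. The function names and the `cell_width` parameter are assumptions.

```python
import math
import random

def random_projection(points, target_dim, seed=None):
    """Project each point onto `target_dim` random Gaussian directions."""
    rng = random.Random(seed)
    dim = len(points[0])
    directions = [[rng.gauss(0.0, 1.0) / math.sqrt(target_dim) for _ in range(dim)]
                  for _ in range(target_dim)]
    return [[sum(g * x for g, x in zip(row, p)) for row in directions]
            for p in points]

def random_grid_partition(points, cell_width, seed=None):
    """Group point indices by the cell of a randomly shifted axis-aligned grid."""
    rng = random.Random(seed)
    dim = len(points[0])
    shift = [rng.uniform(0.0, cell_width) for _ in range(dim)]
    cells = {}
    for index, p in enumerate(points):
        key = tuple(math.floor((p[d] + shift[d]) / cell_width) for d in range(dim))
        cells.setdefault(key, []).append(index)
    return list(cells.values())

embedded = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [9.0, 9.0, 9.0]]
low_dim = random_projection(embedded, target_dim=2, seed=1)
groups = random_grid_partition(low_dim, cell_width=1.0, seed=1)
```

Points farther apart than one cell width in some coordinate always land in different cells, while nearby points are separated only with small probability over the random shift.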

Page 15:

Miclo’s conjecture

Higher-order Cheeger Conjecture [Miclo 08]:

For every graph G and $k \in \mathbb{N}$, we have

$$\frac{\lambda_k}{2} \le \rho_G(k) \le C(k)\sqrt{\lambda_k}$$

for some C(k) depending only on k.

[Lee-OveisGharan-Trevisan 2012]:

True with

This bound for C(k) is tight.

Algorithm of Ng-Jordan-Weiss works, changing the last step.

(Figure: disjoint subsets $S_1$, $S_2$, $S_3$, $S_4$ of the vertex set.)

Page 16:

hybrid algorithms

Suppose the data has some nice low-dimensional structure

Spectral embedding could lose that information: back in a high-dimensional space

Page 17:

hybrid algorithms

Suppose the data has some nice low-dimensional structure

Use spectral embedding distances to deform the data

Do clustering on transformed data set

Page 18:

unraveling the mysteries of complexity

Page 19:

the unique games conjecture

Consider linear equations in two variables, modulo a prime p

Variables: $x_1, x_2, \ldots, x_n$

$x_{12} + x_2 = 4$

$x_4 - 3x_7 = 1$

$x_9 + 8x_{12} = 9$

…

If there exists a solution that satisfies 99% of the equations, can you find one that satisfies 10%?

Conjectured to be NP-hard [Khot 2002]
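To make the problem concrete, here is a small sketch that scores an assignment against two-variable linear equations modulo p. The tuple encoding `(a, i, b, j, c)` for a·x_i + b·x_j ≡ c (mod p), the choice p = 11, and the sample assignment are all assumptions for illustration; the slide gives neither a modulus nor a solution, and its variable indices are read here as 12, 2, 4, 7 and 9.

```python
def satisfied_fraction(equations, assignment, p):
    """Fraction of equations a*x_i + b*x_j = c (mod p) that the assignment satisfies."""
    hits = sum((a * assignment[i] + b * assignment[j] - c) % p == 0
               for a, i, b, j, c in equations)
    return hits / len(equations)

# The three sample equations, encoded as (a, i, b, j, c).
equations = [
    (1, 12, 1, 2, 4),   # x_12 + x_2   = 4
    (1, 4, -3, 7, 1),   # x_4  - 3 x_7 = 1
    (1, 9, 8, 12, 9),   # x_9  + 8 x_12 = 9
]
assignment = {2: 3, 4: 4, 7: 1, 9: 1, 12: 1}
print(satisfied_fraction(equations, assignment, p=11))  # all three hold: 1.0
```

Because each equation involves two variables and p is prime, fixing one variable's value forces a unique value for the other; that one-to-one property is what "unique" refers to.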

Page 20:

a spectral attack

Construct a graph with one vertex for every variable, and an edge whenever two variables occur in the same constraint.

$x_{12} + x_2 = 4$

$x_4 - 3x_7 = 1$

$x_9 + 8x_{12} = 9$

…

A “good” solution to the equations implies a partition of the graph into p nice clusters!
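The graph construction described on this slide can be sketched as follows, assuming each constraint is encoded as a tuple `(a, i, b, j, c)` meaning a·x_i + b·x_j = c. The encoding and the function name `constraint_graph` are illustrative, and the variable indices are read from the slide as 12, 2, 4, 7 and 9.

```python
def constraint_graph(equations):
    """One vertex per variable, one undirected edge per constraint."""
    edges = set()
    for _, i, _, j, _ in equations:
        edges.add((min(i, j), max(i, j)))  # store each edge once, smaller index first
    vertices = sorted({v for edge in edges for v in edge})
    return vertices, sorted(edges)

equations = [
    (1, 12, 1, 2, 4),   # x_12 + x_2   = 4
    (1, 4, -3, 7, 1),   # x_4  - 3 x_7 = 1
    (1, 9, 8, 12, 9),   # x_9  + 8 x_12 = 9
]
vertices, edges = constraint_graph(equations)
print(vertices)  # [2, 4, 7, 9, 12]
print(edges)     # [(2, 12), (4, 7), (9, 12)]
```

The resulting edge list can be turned into a weight matrix and fed to any spectral routine.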

Page 21:

a spectral attack

Higher-order Cheeger Theorem:

For every graph G and $k \in \mathbb{N}$, we have

(Figure: disjoint subsets $S_1$, $S_2$, $S_3$, $S_4$ of the vertex set.)

Unnecessary for large k:

[Arora-Barak-Steurer 2010]

A better asymptotic dependence would disprove the UGC.

Page 22: