Spectral Graph Theory (Basics)

CMU SCS

Spectral Graph Theory (Basics)

Charalampos (Babis) TsourakakisCMU

Charalampos E. Tsourakakis

CMU SCS

Outline• Reminders • Adjacency matrix

– Intuition behind eigenvectors: Bipartite Graphs– Walks of length k– Case Study: Triangles

• Laplacian– Connected Components– Intuition: Adjacency vs. Laplacian– Sparsest Cut and Cheeger Inequality :

• Derivation, intuition• Example

• Normalized Laplacian


CMU SCS

Matrix Representations of G(V,E)Associate a matrix to a graph:• Adjacency matrix• Laplacian• Normalized Laplacian


Main focus

CMU SCS

Matrix as an operator


e.g.,The image of the unit circle (sphere) under any mxn matrix is an ellipse (hyperellipse).

CMU SCS

More Reminders• Let M be a symmetric nxn matrix.


λ eigenvaluex eigenvector

CMU SCS

More Reminders

1-Dimensional Invariant Subspaces


y x

Ay

Ax (λ,u)

Diagonal:No rotation

CMU SCS

Keep in mind!• For the rest of slides we are talking for

square nxn matrices

and unless noticed symmetric ones, i.e,


nnn

n

mm

mmM

1

111

...

TMM

CMU SCS

Outline• Reminders

• Adjacency matrix– Intuition behind eigenvectors: Bipartite Graphs– Walks of length k– Case Study: Triangles

• Laplacian– Connected Components– Intuition: Adjacency vs. Laplacian– Cheeger Inequality and Sparsest Cut:




CMU SCS

Adjacency matrix


A=1

2 3

4

Undirected

CMU SCS

Adjacency matrix


A=1

2 3

410

0.3

2

4

Undirected Weighted

CMU SCS

Adjacency matrix


1

2 3

4

ObservationIf G is undirected,A = AT

Directed

CMU SCS

Spectral TheoremTheorem [Spectral Theorem]• If M=MT, then

where


Tnnn

T

Tn

T

n

n xxxxx

xxxM

111

11

1 ......

xi xj

λjλ i Reminder 1:

xi,xj orthogonal

Reminder 2:xi i-th principal axisλi length of i-th principal axis

0

0

CMU SCS


• Adjacency matrix– Intuition behind eigenvectors: Bipartite Graphs– Walks of length k– Case Study: Triangles





CMU SCS

Bipartite Graphs


Any graph with no cycles of odd length is bipartite

e.g., all trees are bipartite

Can we check if a graph is bipartitevia its spectrum?

Can we get the partition of the verticesin the two sets of nodes?

1 4

2 5

3 6

K3,3

CMU SCS

Bipartite Graphs


Adjacency matrix

where

Eigenvalues:

1 4

2 5

3 6

K3,3

Λ=[3,-3,0,0,0,0] Why λ1=-λ2=3?Recall: Ax=λx, (λ,x) eigenvalue-eigenvector

CMU SCS

Bipartite Graphs


1 4

2 5

3 6

11

1

1

1

1

1 4

5

6

1 1

1

1

233=3x1

Repeat same argument for the other nodes

CMU SCS

Bipartite Graphs


1 4

2 5

3 6

11

1

-1

-1

-1

1 4

5

6

-1 -1

-1

-1

-2-3-3=(-3)x1

Repeat same argument for the other nodes

CMU SCS

Bipartite Graphs• Observation

u2 gives the partition of the nodes in the two sets S, V-S!


S V-S

Theorem: λ2=-λ1 iff G bipartite. u2 gives the partition.

Question: Were we just “lucky”? Answer: No

CMU SCS


• Adjacency matrix– Intuition behind eigenvectors: Bipartite Graphs

–Walks of length k– Case Study: Triangles





CMU SCS

Walks• A walk of length r in a directed graph:

where a node can be used more than once.• Closed walk when:


1

2 3

4

1

2 3

4

Walk of length 2 2-1-4

Closed walk of length 32-1-3-2

CMU SCS

Walks• Theorem

G(V,E) directed graph, adjacency matrix A. The number of walks from node u to node v in G with length r is (Ar)uv

• Proof Induction on k. See Doyle-Snell, p.165


r

ijr

ijij aAaAaA .., , , 221

(i,j) (i, i1),(i1,j) (i,i1),..,(ir-1,j)

CMU SCS

Walks


1

2 3

4

1

2 3

4

1

2 3

4i=2, j=4 i=3, j=3

CMU SCS

Walks


1

2 3

4

1

3

4

2

Always 0,node 4 is a sink

CMU SCS

Walks• Corollary A adjacency matrix of undirected G(V,E)

(no self loops), e edges and t triangles. Then the following hold:a) trace(A) = 0 b) trace(A2) = 2ec) trace(A3) = 6t


11 2

1 2

3

Computing Ar is a bad idea:High memory requirements,expensive!

CMU SCS


• Adjacency matrix– Intuition behind eigenvectors: Bipartite Graphs– Walks of length k

–Case Study: Triangles• Laplacian

– Connected Components– Intuition: Adjacency vs. Laplacian– Sparsest Cut and Cheeger Inequality :




CMU SCS

Why is Triangle Counting important?From the Graph Mining Perspective

• Clustering coefficient • Transitivity ratio

• Social Network Analysis fact:

“Friends of friends are friends”

A

CB

Other applications include:• Hidden Thematic Structure of the Web• Motif Detection, e.g., biological networks• Web Spam Detection


CMU SCS

Theorem

δ(G) = # triangles in graph G(V,E) = eigenvalues of adjacency

matrix A

||

1

3)(6V

iiG

||21 ... V

Theorem [EigenTriangle]


CMU SCS

Theorem

δ(i) = #Δs vertex i participates at. = i-th eigenvector = j-th entry of

2||

1

3)(2 ij

V

jjui

ijuiu

iu

Theorem[EigenTriangleLocal]


CMU SCS

• Almost symmetric around 0!

Political BlogsOmit!

Keep only 3!

3

Algorithm’s idea


CMU SCS

Observe the trend


Results: Edges vs. Speedup

CMU SCS

2-3 eigenvaluesalmost ideal results!


#Eigenvalues vs. ϱ

CMU SCS







CMU SCS

Laplacian


L= D-A=1

2 3

4

Diagonal matrix, dii=di

CMU SCS

Weighted Laplacian


1

2 3

410

0.3

2

4

CMU SCS



• Laplacian–Connected Components– Intuition: Adjacency vs. Laplacian– Sparsest Cut and Cheeger Inequality :




CMU SCS

Connected Components• Lemma

Let G be a graph with n vertices and c connected components. If L is the Laplacian of G, then rank(L)=n-c.

• Proof see p.279, Godsil-Royle


CMU SCS

Connected Components


G(V,E)L=

eig(L)=

#zeros = #components

1 2 3

6

7 5

4

CMU SCS

Connected Components


G(V,E)L=

eig(L)=

#zeros = #components

1 2 3

6

7 5

40.01

Indicates a “good cut”

CMU SCS



• Laplacian– Connected Components

– Intuition: Adjacency vs. Laplacian– Sparsest Cut and Cheeger Inequality :




CMU SCS

Adjacency vs. Laplacian Intuition


V-SLet x be an indicator vector: S

SixSix

i

i

if ,0 if ,1

Consider now y=Lx k-th coordinate

CMU SCS



Consider now y=Lx

G30,0.5

S

k

CMU SCS



Consider now y=Lx

G30,0.5

S

kk

Laplacian: connectivity, Adjacency: #paths

CMU SCS







CMU SCS

Why Sparse Cuts?• Clustering, Community Detection

• And more: Telephone Network Design, VLSI layout, Sparse Gaussian Elimination, Parallel Computation


1

2 3

4

5

6 7

8

9

cut

CMU SCS

Quality of a Cut• Edge expansion/Isoperimetric number φ


1

2 3

4

CMU SCS

Quality of a Cut• Edge expansion/Isoperimetric number φ


1

2 3

4

and thus

CMU SCS

Why λ2?


SixSix

i

i

if ,0 if ,1

Characteristic Vector x

Then:

SV-S

Edges across cut

CMU SCS

Why λ2?


1

2 3

4

5

6 7

8

9

cut

x=[1,1,1,1,0,0,0,0,0]T

S V-S

xTLx=2

CMU SCS

Why λ2?


Ratio cut

Sparsest ratio cut

NP-hard

Relax the constraint:

Normalize: ?

CMU SCS

Why λ2?


Sparsest ratio cut

NP-hard

Relax the constraint:

Normalize:λ2

because of the Courant-Fisher theorem (applied to L)

CMU SCS

Why λ2?


Each ball 1 unit of mass

Eigenvector value

Node idxLx x1 xnOSCILLATE

CMU SCS

Why λ2?


Fundamental mode of vibration: “along” the separator

CMU SCS

Cheeger Inequality


• Step 1: Sort vertices in non-decreasing order according to their assigned by the second eigenvector value.

• Step 2: Decide where to cut. • Bisection• Best ratio cut Two common heuristics

CMU SCS




• Derivation, intuition

• Example• Normalized Laplacian


CMU SCS

Example: Spectral Partitioning


• K500 • K500

dumbbell graph

A = zeros(1000); A(1:500,1:500)=ones(500)-eye(500); A(501:1000,501:1000)= ones(500)-eye(500); myrandperm = randperm(1000);B = A(myrandperm,myrandperm);

In social network analysis, such clusters are called communities

CMU SCS

Example: Spectral Partitioning• This is how adjacency matrix of B looks


spy(B)

CMU SCS

Example: Spectral Partitioning• This is how the 2nd eigenvector of B looks

like.


L = diag(sum(B))-B;[u v] = eigs(L,2,'SM');plot(u(:,1),’x’)

Not so much information yet…

CMU SCS

Example: Spectral Partitioning• This is how the 2nd eigenvector looks if we

sort it.


[ign ind] = sort(u(:,1));plot(u(ind),'x')

But now we seethe two communities!

CMU SCS

Example: Spectral Partitioning• This is how adjacency matrix of B looks

now


spy(B(ind,ind))

Cut here!Community 1

Community 2Observation: Both heuristics are equivalent for the dumbell

CMU SCS







CMU SCS

Where does it go from here?• Normalized Laplacian

– Ng, Jordan, Weiss Spectral Clustering – Laplacian Eigenmaps for Manifold Learning– Computer Vision and many more

applications…


Standard reference: Spectral Graph TheoryMonograph by Fan Chung Graham

CMU SCS

Why Normalized Laplacian


• K500 • K500

The onlyweightededge!

Cut hereφ=φ=

Cut here>

So, φ is not good here…

CMU SCS

Why Normalized Laplacian


• K500 • K500

The onlyweightededge!

Cut hereφ=φ=

Cut here>Optimize Cheeger

constant h(G),balanced cuts

where

CMU SCS

Conclusions• Spectrum tells us a lot about the graph.• What to remember

– What is an eigenvector (f:Nodes Reals)– Adjacency: #Paths– Laplacian: Sparsest Cut and Intuition– Normalized Laplacian: Normalized cuts, tend to

avoid unbalanced cuts


CMU SCS

References• A list of references is on my web site, in the

KDD tutorial web page www.cs.cmu.edu/~ctsourak/


http://www.cs.cmu.edu/~ctsourak/kdd09.htm

CMU SCS

Thank you!


Spectral Graph Theory (Basics)

Documents

Transcript of Spectral Graph Theory (Basics)