Sparsification of Graphs and Matrices
Daniel A. Spielman
Yale University
joint work with
Joshua Batson (MIT), Nikhil Srivastava (MSR), Shang-Hua Teng (USC)
HUJI, May 21, 2014
Objective of Sparsification:
Approximate any (weighted) graph by a
sparse weighted graph.
Two earlier notions:
Spanners preserve distances [Chew '89].
Cut-sparsifiers preserve the weight of the edges leaving every set $S \subseteq V$ [Benczur-Karger '96].
Spectral Sparsification [S-Teng]
Approximate any (weighted) graph by a sparse weighted graph.
Graph Laplacian of $G = (V, E, w)$:
$$x^T L_G x = \sum_{(u,v) \in E} w_{u,v} (x(u) - x(v))^2$$
Laplacian Quadratic Form, examples
All edge-weights are 1. [Figure: a small graph whose vertices carry values $x \in \{0, 1\}$.]
The form sums the squares of the differences across edges:
$$x^T L_G x = \sum_{(u,v) \in E} (x(u) - x(v))^2$$
Each edge whose endpoints differ contributes $(0 - 1)^2 = 1$.
Laplacian Quadratic Form, examples
When $x$ is the characteristic vector of a set $S$, the form sums the weights of the edges on the boundary of $S$:
$$x^T L_G x = \sum_{\substack{(u,v) \in E \\ u \in S,\ v \notin S}} w_{u,v}$$
[Figure: a graph with the vertices of $S$ labeled 1 and the rest labeled 0.]
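To make the boundary identity concrete, here is a minimal numpy sketch (my own toy graph, not one from the talk) that builds a Laplacian and checks that the quadratic form of a characteristic vector equals the cut weight:

```python
import numpy as np

# Toy weighted graph (hypothetical example): vertex set {0,1,2,3}.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
weights = [1.0, 2.0, 1.0, 3.0, 1.5]
n = 4

# L_G = sum over edges of w * b b^T, with b = delta_u - delta_v.
L = np.zeros((n, n))
for (u, v), w in zip(edges, weights):
    b = np.zeros(n); b[u], b[v] = 1.0, -1.0
    L += w * np.outer(b, b)

# Characteristic vector of S: the quadratic form equals the cut weight.
S = {0, 1}
x = np.array([1.0 if i in S else 0.0 for i in range(n)])
cut = sum(w for (u, v), w in zip(edges, weights) if (u in S) != (v in S))
print(x @ L @ x, cut)   # both are 6.5
```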
Learning on Graphs (Zhu-Ghahramani-Lafferty '03)
Infer the values of a function at all vertices from known values at a few vertices: minimize
$$\sum_{(a,b) \in E} (x(a) - x(b))^2$$
subject to the known values. [Figure: a graph with two vertices fixed at 0 and 1; the minimizer assigns values such as 0.5, 0.5, 0.625, and 0.375 to the others.]
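A hedged sketch of this minimization (my own 5-cycle example; the talk's figure uses a different graph): fixing the known values and setting the gradient to zero at the free vertices reduces the problem to one Laplacian linear solve.

```python
import numpy as np

n = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]   # a 5-cycle
L = np.zeros((n, n))
for u, v in edges:
    b = np.zeros(n); b[u], b[v] = 1.0, -1.0
    L += np.outer(b, b)

known = {0: 0.0, 2: 1.0}                           # the labeled vertices
free = [i for i in range(n) if i not in known]
kidx = list(known)

# Minimizing sum (x(a)-x(b))^2 with x fixed on 'known' gives, at the
# free vertices, (L x)_free = 0, i.e. L_ff x_f = -L_fk x_k.
x_k = np.array([known[i] for i in kidx])
x_f = np.linalg.solve(L[np.ix_(free, free)], -L[np.ix_(free, kidx)] @ x_k)

x = np.zeros(n)
x[kidx] = x_k
x[free] = x_f
print(x)   # each free value is the average of its neighbors' values
```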
Graphs as Resistor Networks
View the edges as resistors connecting the vertices. Apply voltages at some vertices, and measure the induced voltages and current flow. [Figure: a network with 1V and 0V applied; the induced voltages are 0.5V, 0.5V, 0.625V, and 0.375V.]
The induced voltages minimize
$$\sum_{(a,b) \in E} (v(a) - v(b))^2$$
subject to the voltages fixed by the battery.
Graphs as Resistor Networks
The effective resistance between $s$ and $t$ is the potential difference needed to drive one unit of current from $s$ to $t$. [Figure: two small networks of unit resistors, one with $R_{\mathrm{eff}}(s,t) = 1.5$ and one with $R_{\mathrm{eff}}(s,t) = 2$.]
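In matrix terms, $R_{\mathrm{eff}}(s,t) = b_{s,t}^T L_G^+ b_{s,t}$. A small numeric sketch (standard formula; the example graph is mine, not the talk's figure):

```python
import numpy as np

n = 4
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]   # a 4-cycle: two s-t paths
L = np.zeros((n, n))
for u, v in edges:
    b = np.zeros(n); b[u], b[v] = 1.0, -1.0
    L += np.outer(b, b)

Lp = np.linalg.pinv(L)                      # pseudoinverse of the Laplacian
s, t = 0, 2
b_st = np.zeros(n); b_st[s], b_st[t] = 1.0, -1.0
print(b_st @ Lp @ b_st)                     # 1.0: two 2-ohm paths in parallel
```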
Laplacian Matrices
$$x^T L_G x = \sum_{(u,v) \in E} w_{u,v} (x(u) - x(v))^2$$
E.g.
$$L_{1,2} = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \end{pmatrix} \begin{pmatrix} 1 & -1 \end{pmatrix}$$
$$L_G = \sum_{(u,v) \in E} w_{u,v} L_{u,v}$$
Laplacian Matrices
$$L_G = \sum_{(u,v) \in E} w_{u,v} L_{u,v} = \sum_{(u,v) \in E} w_{u,v}\, b_{u,v} b_{u,v}^T, \qquad \text{where } b_{u,v} = \delta_u - \delta_v$$
As a sum of outer products, $L_G$ is positive semidefinite. If $G$ is connected, its nullspace is $\mathrm{Span}(\mathbf{1})$.
Inequalities on Graphs and Matrices
For matrices $M$ and $\widetilde{M}$: $M \preceq \widetilde{M}$ if $x^T M x \le x^T \widetilde{M} x$ for all $x$.
For graphs $G = (V, E, w)$ and $H = (V, F, z)$: $G \preceq H$ if $L_G \preceq L_H$, and $G \preceq kH$ if $L_G \preceq k L_H$.
Approximations of Graphs and Matrices
$M \approx_\epsilon \widetilde{M}$ if $\frac{1}{1+\epsilon} M \preceq \widetilde{M} \preceq (1+\epsilon) M$.
For graphs $G = (V, E, w)$ and $H = (V, F, z)$: $G \approx_\epsilon H$ if $L_G \approx_\epsilon L_H$. That is, for all $x$,
$$\frac{1}{1+\epsilon} \;\le\; \frac{\sum_{(u,v) \in F} z_{u,v} (x(u) - x(v))^2}{\sum_{(u,v) \in E} w_{u,v} (x(u) - x(v))^2} \;\le\; 1 + \epsilon$$
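One way to test $G \approx_\epsilon H$ numerically (a sketch under my own conventions, not code from the talk) is to restrict both Laplacians to the space orthogonal to $\mathbf{1}$ and inspect the spectrum of the relative form:

```python
import numpy as np

def laplacian(n, edges, weights):
    L = np.zeros((n, n))
    for (u, v), w in zip(edges, weights):
        b = np.zeros(n); b[u], b[v] = 1.0, -1.0
        L += w * np.outer(b, b)
    return L

n = 4
L_G = laplacian(n, [(0,1), (1,2), (2,3), (0,3), (0,2)], [1, 1, 1, 1, 1])
L_H = laplacian(n, [(0,1), (1,2), (2,3), (0,3)], [1.2, 1.2, 1.2, 1.2])

# Orthonormal basis Q for the space orthogonal to the all-ones vector.
J = np.eye(n) - np.ones((n, n)) / n
Q = np.linalg.svd(J)[0][:, :n-1]

# Eigenvalues of A^{-1/2} B A^{-1/2} with A = Q^T L_G Q, B = Q^T L_H Q.
A, B = Q.T @ L_G @ Q, Q.T @ L_H @ Q
w, V = np.linalg.eigh(A)
Ainvh = V @ np.diag(w ** -0.5) @ V.T
vals = np.linalg.eigvalsh(Ainvh @ B @ Ainvh)
print(vals.min(), vals.max())  # G ~eps H iff these lie in [1/(1+eps), 1+eps]
```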
Implications of Approximation
If $G \approx_\epsilon H$, then:
$L_H$ and $L_G$ have similar eigenvalues.
Solutions to systems of linear equations are similar: $L_G^+ \approx_\epsilon L_H^+$.
Boundaries of sets are similar.
Effective resistances are similar.
Spectral Sparsification [S-Teng]
For an input graph $G$ with $n$ vertices, find a sparse graph $H$ having $O(n)$ edges so that $G \approx_\epsilon H$.
Why? Solving linear equations in Laplacian matrices: a key part of nearly-linear time algorithms, used for learning on graphs, maxflow, PDEs, ...
Spectral sparsifiers preserve eigenvectors, eigenvalues, and electrical properties; they generalize expanders; and they are certifiable cut-sparsifiers.
Approximations of Complete Graphs are Expanders
Expanders: $d$-regular graphs on $n$ vertices ($n$ grows, $d$ fixed); every set of vertices has a large boundary; random walks mix quickly; incredibly useful.
Weak expanders: eigenvalues bounded away from 0. Strong expanders: all eigenvalues near $d$.
Example: Approximating a Complete Graph
For $G$ the complete graph on $n$ vertices, all non-zero eigenvalues of $L_G$ are $n$: for $x \perp \mathbf{1}$ with $\|x\| = 1$, $x^T L_G x = n$.
For $H$ a $d$-regular strong expander, all non-zero eigenvalues of $L_H$ are close to $d$: for $x \perp \mathbf{1}$ with $\|x\| = 1$, $x^T L_H x \sim d$.
So $\frac{n}{d} H$ is a good approximation of $G$.
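A quick numeric sanity check of the eigenvalue claim (mine, not the speaker's):

```python
import numpy as np

# Laplacian of the complete graph K_n is nI - J (J = all-ones matrix).
n = 7
L = n * np.eye(n) - np.ones((n, n))
print(np.round(np.linalg.eigvalsh(L), 8))  # [0. 7. 7. 7. 7. 7. 7.]
```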
Best Approximations of Complete Graphs
Ramanujan Expanders [Margulis, Lubotzky-Phillips-Sarnak]:
$$d - 2\sqrt{d-1} \;\le\; \lambda(L_H) \;\le\; d + 2\sqrt{d-1}$$
Cannot do better if $n$ grows while $d$ is fixed [Alon-Boppana].
Can we approximate every graph this well?
Example: Dumbbell
$G$: two complete graphs $K_n$ joined by a single edge of weight 1.
$H$: two $d$-regular Ramanujan graphs, scaled by $n/d$, joined by the same weight-1 edge.
Decompose $G = G_1 + G_2 + G_3$ and $H = H_1 + H_2 + H_3$ (the two cliques and the middle edge). Then
$$G_1 \preceq (1+\epsilon) H_1, \quad G_2 = H_2, \quad G_3 \preceq (1+\epsilon) H_3 \implies G \preceq (1+\epsilon) H$$
Cut-Sparsifiers [Benczur-Karger '96]
For every $G$, there is an $H$ with $O(n \log n / \epsilon^2)$ edges such that for all $x \in \{0,1\}^n$,
$$\frac{1}{1+\epsilon} \;\le\; \frac{x^T L_H x}{x^T L_G x} \;\le\; 1 + \epsilon$$
Cut-approximation is different
[Figure: a path on vertices $0, 1, 2, 3, \ldots, m$ whose edges have weight $k$, plus one edge of weight 1 joining the two ends.]
For the test vector $x(i) = i$,
$$x^T L_G x = km + m^2$$
so when $k < m$ the long edge contributes most of the quadratic form: a spectral sparsifier needs the long edge, even though its weight is negligible in every cut.
Main Theorems for Graphs
For every $G = (V, E, w)$ and every $\epsilon$, there is an $H = (V, F, z)$ with $|F| \le |V| (2+\epsilon)^2 / \epsilon^2$ and $G \approx_\epsilon H$.
This is within a factor of 2 of the Ramanujan bound.
By careful random sampling (S-Srivastava '08), one gets $O(n \log n / \epsilon^2)$ edges, in time $O(|E| \log^2 |V| \log(1/\epsilon))$ (Koutis-Levin-Peng '12).
Sparsification by Random Sampling
Assign a probability $p_{u,v}$ to each edge $(u,v)$. Include edge $(u,v)$ in $H$ with probability $p_{u,v}$; if included, give it weight $w_{u,v}/p_{u,v}$. Then
$$\mathbb{E}[L_H] = \sum_{(u,v) \in E} p_{u,v} (w_{u,v}/p_{u,v}) L_{u,v} = L_G$$
Choose $p_{u,v}$ to be $w_{u,v}$ times the effective resistance between $u$ and $v$. Low resistance between $u$ and $v$ means there are many alternate routes for current to flow, so the edge is not critical. The proof uses random matrix concentration bounds (Rudelson, Ahlswede-Winter, Tropp, etc.).
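A hedged sketch of one round of effective-resistance sampling (the oversampling factor is chosen arbitrarily here; the real algorithm sets constants via the concentration bounds):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30                                         # K_30, unit weights
edges = [(u, v) for u in range(n) for v in range(u + 1, n)]

L = np.zeros((n, n))
for u, v in edges:
    b = np.zeros(n); b[u], b[v] = 1.0, -1.0
    L += np.outer(b, b)
Lp = np.linalg.pinv(L)                         # L^+ for effective resistances

# Keep edge e with prob p_e ~ w_e * R_eff(e) (times an oversampling factor),
# reweighting kept edges by w_e / p_e so that E[L_H] = L_G.
C = np.log(n)                                  # arbitrary oversampling factor
L_H = np.zeros((n, n))
kept = 0
for u, v in edges:
    b = np.zeros(n); b[u], b[v] = 1.0, -1.0
    p = min(1.0, C * (b @ Lp @ b))             # w_e = 1 here
    if rng.random() < p:
        L_H += (1.0 / p) * np.outer(b, b)
        kept += 1
print(kept, "of", len(edges), "edges kept")
```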
Matrix Sparsification
$$M = B^T B = \sum_e b_e b_e^T \qquad \widetilde{M} = \sum_e s_e b_e b_e^T, \ \text{most } s_e = 0$$
$$\frac{1}{1+\epsilon} M \preceq \widetilde{M} \preceq (1+\epsilon) M$$
Main Theorem (Batson-S-Srivastava)
For $M = \sum_e b_e b_e^T$, there exist $s_e$, at most $n (2+\epsilon)^2 / \epsilon^2$ of them non-zero, so that $M \approx_\epsilon \widetilde{M} = \sum_e s_e b_e b_e^T$.
Simplification of Matrix Sparsification
$$\frac{1}{1+\epsilon} M \preceq \widetilde{M} \preceq (1+\epsilon) M \quad \text{is equivalent to} \quad \frac{1}{1+\epsilon} I \preceq M^{-1/2} \widetilde{M} M^{-1/2} \preceq (1+\epsilon) I$$
Set $v_e = M^{-1/2} b_e$. Then $\sum_e v_e v_e^T = I$, a "decomposition of the identity": $\sum_e \langle u, v_e \rangle^2 = \|u\|^2$ for every $u$.
We need $\sum_e s_e v_e v_e^T \approx_\epsilon I$. Random sampling sets $p_e = \|v_e\|^2$.
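A small numpy sketch of the whitening step (random $b_e$ of my own choosing): after mapping $b_e \mapsto v_e = M^{-1/2} b_e$, the outer products sum to the identity.

```python
import numpy as np

rng = np.random.default_rng(1)
m, d = 12, 4
B = rng.standard_normal((m, d))       # rows are the vectors b_e
M = B.T @ B                           # M = sum_e b_e b_e^T

w, U = np.linalg.eigh(M)
M_inv_half = U @ np.diag(w ** -0.5) @ U.T
V = B @ M_inv_half                    # rows are v_e = M^{-1/2} b_e

print(np.allclose(V.T @ V, np.eye(d)))   # True: sum_e v_e v_e^T = I
```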
What happens when we add a vector?
Interlacing
More precisely. Characteristic polynomial, rank-one update:
$$p_{A + vv^T}(x) = \left(1 + \sum_i \frac{\langle v, u_i \rangle^2}{\lambda_i - x}\right) p_A(x), \qquad \text{where } A u_i = \lambda_i u_i$$
The new eigenvalues are the zeros of this polynomial.
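A numeric spot-check of the rank-one update formula (my own verification, with the convention $p_A(x) = \det(xI - A)$):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
v = rng.standard_normal(n)
lam, U = np.linalg.eigh(A)            # A u_i = lambda_i u_i

x = 0.73                              # any point that is not an eigenvalue
lhs = np.linalg.det(x * np.eye(n) - (A + np.outer(v, v)))
# With p_A(x) = det(xI - A), the update reads
# p_{A+vv^T}(x) = (1 + sum_i <v,u_i>^2 / (lambda_i - x)) p_A(x).
rhs = (1 + np.sum((U.T @ v) ** 2 / (lam - x))) \
      * np.linalg.det(x * np.eye(n) - A)
print(np.isclose(lhs, rhs))           # True
```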
Physical model of interlacing
The $\lambda_i$ are positive unit charges resting at barriers on a slope. Here $u_i$ is an eigenvector, $v$ is the added vector, and $\langle v, u_i \rangle^2$ is the charge placed on barrier $i$.
Barriers repel the eigenvalues, gravity pulls them down the slope, and the repulsion follows an inverse law.
Examples
Ex 1: all weight on $u_1$ ($v = u_1$). The top eigenvalue is pushed up against the next barrier; gravity keeps the rest resting on their barriers.
Ex 2: equal weight on $u_1$ and $u_2$.
Ex 3: equal weight on all of $u_1, u_2, \ldots, u_n$: a charge of $1/n$ on every barrier, so every eigenvalue moves up a little.
Adding a random $v_e$
Because the $v_e$ are a decomposition of the identity, $\mathbb{E}_e\left[\langle v_e, u_i \rangle^2\right] = 1/m$, so
$$\mathbb{E}_e\left[p_{A + v_e v_e^T}\right] = \left(1 + \frac{1}{m} \sum_i \frac{1}{\lambda_i - x}\right) p_A = p_A - \frac{1}{m} \frac{d}{dx} p_A$$
Many random $v_e$
$$\mathbb{E}_e\left[p_{A + v_e v_e^T}(x)\right] = \left(1 - \frac{1}{m} \frac{d}{dx}\right) p_A(x)$$
$$\mathbb{E}_{e_1, \ldots, e_k}\left[p_{v_{e_1} v_{e_1}^T + \cdots + v_{e_k} v_{e_k}^T}(x)\right] = \left(1 - \frac{1}{m} \frac{d}{dx}\right)^k x^n$$
This is an associated Laguerre polynomial! For $k = n/\epsilon^2$, its roots lie between $\frac{(1-\epsilon)^2 n}{m \epsilon^2}$ and $\frac{(1+\epsilon)^2 n}{m \epsilon^2}$.
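An exploratory check (mine) of this expected polynomial: apply the operator $(1 - \frac{1}{m}\frac{d}{dx})$ repeatedly to $x^n$ and compare the extreme roots with the stated bounds (which are asymptotic, so expect only rough agreement at small $n$).

```python
import numpy as np
from numpy.polynomial import polynomial as P

n, m, eps = 8, 40, 0.5
k = int(n / eps**2)                      # k = 32 applications

c = np.zeros(n + 1); c[n] = 1.0          # coefficients of x^n (low-to-high)
for _ in range(k):
    d = P.polyder(c)                     # d/dx
    c = c - np.pad(d, (0, len(c) - len(d))) / m

roots = np.sort(np.roots(c[::-1]))       # np.roots wants high-to-low coeffs
lo = (1 - eps)**2 * n / (m * eps**2)
hi = (1 + eps)**2 * n / (m * eps**2)
print(roots.real.min(), roots.real.max(), "vs asymptotic bounds", lo, hi)
```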
Matrix Sparsification Proof Sketch
Have $\sum_e v_e v_e^T = I$. Want $\sum_e s_e v_e v_e^T \approx_\epsilon I$ with $\epsilon \approx 2.6$.
Will do it with $|\{e : s_e \neq 0\}| \le 6n$, keeping all eigenvalues between 1 and 13 (after scaling).
Broad outline: moving barriers
Start with barriers at $-n$ and $n$; all eigenvalues of $A = 0$ sit at 0, between them.
Step 1: add a vector, and move the lower barrier by $+1/3$ (a tighter constraint) and the upper barrier by $+2$ (a looser constraint), to $-n + 1/3$ and $n + 2$.
Step $i+1$: add another vector and again advance the barriers by $+1/3$ and $+2$. [Figures: at every step, all eigenvalues stay strictly between the two barriers.]
Step $6n$: the barriers have reached $n$ and $13n$, so all eigenvalues lie between $n$ and $13n$: a 2.6-approximation with $6n$ vectors.
Problem
We need to show that an appropriate vector to add always exists. A bound on $\lambda_{\max}$ alone is not strong enough for induction:
If there are many small eigenvalues, one added vector can only move one of them.
Bunched large eigenvalues repel the highest one.
The Lower Barrier Potential Function
$$\Phi_\ell(A) = \sum_i \frac{1}{\lambda_i - \ell} = \mathrm{Tr}\left((A - \ell I)^{-1}\right)$$
$$\Phi_\ell(A) \le 1 \implies \lambda_{\min}(A) \ge \ell + 1$$
More generally: no $\lambda_i$ within distance 1 of $\ell$, no 2 of the $\lambda_i$ within distance 2, no 3 within distance 3, ..., no $k$ within distance $k$.
The Upper Barrier Potential Function
$$\Phi^u(A) = \sum_i \frac{1}{u - \lambda_i} = \mathrm{Tr}\left((uI - A)^{-1}\right)$$
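The two potentials in code (a direct transcription, with an example of my own):

```python
import numpy as np

def upper_potential(A, u):
    """Phi^u(A) = Tr((uI - A)^{-1}); meaningful while u > lambda_max."""
    return np.trace(np.linalg.inv(u * np.eye(len(A)) - A))

def lower_potential(A, l):
    """Phi_l(A) = Tr((A - lI)^{-1}); meaningful while l < lambda_min."""
    return np.trace(np.linalg.inv(A - l * np.eye(len(A))))

A = np.diag([1.0, 2.0, 5.0])
print(upper_potential(A, 6.0))   # 1/5 + 1/4 + 1/1 = 1.45
print(lower_potential(A, 0.0))   # 1/1 + 1/2 + 1/5 = 1.7
```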
The Beginning
$A = 0$ with barriers at $-n$ and $n$: both potentials start at $n \cdot \frac{1}{n} = 1$.
Step $i+1$. Lemma: we can always choose $e$ and $s$ so that adding $s v_e v_e^T$ and advancing the barriers by $+1/3$ and $+2$ does not increase either potential.
Iterating to step $6n$ puts the barriers at $n$ and $13n$: the 2.6-approximation with $6n$ vectors.
Goal: prove the Lemma, i.e., that some $+s v_e v_e^T$ always respects both moving barriers.
Upper Barrier Update
Add $s vv^T$ and set $u' = u + 2$. By the Sherman-Morrison formula,
$$\Phi^{u'}(A + s vv^T) = \mathrm{Tr}\left((u' I - A - s vv^T)^{-1}\right) = \Phi^{u'}(A) + \frac{s\, v^T (u'I - A)^{-2} v}{1 - s\, v^T (u'I - A)^{-1} v}$$
We need this to be at most $\Phi^u(A)$.
How much of $vv^T$ can we add?
Rearranging: $\Phi^{u'}(A + s vv^T) \le \Phi^u(A)$ iff
$$1 \;\ge\; s\, v^T \left( \frac{(u'I - A)^{-2}}{\Phi^u(A) - \Phi^{u'}(A)} + (u'I - A)^{-1} \right) v$$
Write this as $1 \ge s\, v^T U_A v$.
Lower Barrier
Similarly: $\Phi_{\ell'}(A + s vv^T) \le \Phi_\ell(A)$ iff
$$1 \;\le\; s\, v^T \left( \frac{(A - \ell' I)^{-2}}{\Phi_{\ell'}(A) - \Phi_\ell(A)} - (A - \ell' I)^{-1} \right) v$$
Write this as $1 \le s\, v^T L_A v$.
Goal
Show that we can always add some $s v_e v_e^T$ while respecting both barriers ($+1/3$ below, $+2$ above).
Need: $s\, v^T U_A v \le 1 \le s\, v^T L_A v$.
Two expectations
Need: $s\, v^T U_A v \le 1 \le s\, v^T L_A v$. Can show:
$$\mathbb{E}_e\left[v_e^T U_A v_e\right] \le \frac{3}{2m} \qquad \mathbb{E}_e\left[v_e^T L_A v_e\right] \ge \frac{2}{m}$$
So $\mathbb{E}_e\left[v_e^T U_A v_e\right] \le \mathbb{E}_e\left[v_e^T L_A v_e\right]$, and there exists an $e$ with $v_e^T U_A v_e \le v_e^T L_A v_e$. Choosing $s$ with $\frac{1}{s}$ between these two values satisfies both constraints.
Bounding expectations
$$v^T U_A v = \mathrm{Tr}\left(U_A\, vv^T\right)$$
$$\mathbb{E}_e\left[\mathrm{Tr}\left(U_A\, v_e v_e^T\right)\right] = \mathrm{Tr}\left(U_A\, \mathbb{E}_e\left[v_e v_e^T\right]\right) = \mathrm{Tr}\left(U_A\, I/m\right) = \mathrm{Tr}(U_A)/m$$
Bounding expectations
$$\mathrm{Tr}(U_A) = \frac{\mathrm{Tr}\left((u'I - A)^{-2}\right)}{\Phi^u(A) - \Phi^{u'}(A)} + \mathrm{Tr}\left((u'I - A)^{-1}\right)$$
The second term is $\Phi^{u'}(A) \le \Phi^u(A) \le 1$, as the barrier function is monotone decreasing in $u$.
The numerator of the first term is the (negative of the) derivative of the barrier function at $u'$; as the barrier function is convex, the first term is at most $\frac{1}{u' - u} = \frac{1}{2}$.
So $\mathrm{Tr}(U_A) \le 1/2 + 1 = 3/2$.
Bounding expectations
Similarly,
$$\mathrm{Tr}(L_A) \ge \frac{1}{\ell' - \ell} - 1 = \frac{1}{1/3} - 1 = 2$$
Recap of step $i+1$: the Lemma holds, so at every step we can choose $e$ and $s$ such that adding $s v_e v_e^T$ and moving the barriers by $+1/3$ and $+2$ does not increase either potential.
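Putting the pieces together, here is a hedged end-to-end sketch of the selection loop (barrier shifts $+1/3$ and $+2$, $6n$ steps, random isotropic input vectors of my own; illustrative, not the authors' reference implementation):

```python
import numpy as np

rng = np.random.default_rng(3)

# Random isotropic input: whiten rows of B so that sum_e v_e v_e^T = I.
m, n = 200, 8
B = rng.standard_normal((m, n))
w, Q = np.linalg.eigh(B.T @ B)
V = B @ (Q @ np.diag(w ** -0.5) @ Q.T)        # rows are the v_e

A = np.zeros((n, n))
u, l = float(n), float(-n)                    # initial barriers
s = np.zeros(m)

for _ in range(6 * n):
    u2, l2 = u + 2.0, l + 1.0 / 3.0           # shifted barriers
    I = np.eye(n)
    Uinv, Linv = np.linalg.inv(u2 * I - A), np.linalg.inv(A - l2 * I)
    dU = np.trace(np.linalg.inv(u * I - A)) - np.trace(Uinv)
    dL = np.trace(Linv) - np.trace(np.linalg.inv(A - l * I))
    UA = Uinv @ Uinv / dU + Uinv              # upper-barrier matrix U_A
    LA = Linv @ Linv / dL - Linv              # lower-barrier matrix L_A
    for e in rng.permutation(m):              # the Lemma: some e works
        v = V[e]
        hi, lo_ = v @ UA @ v, v @ LA @ v
        if hi <= lo_:
            t = 2.0 / (hi + lo_)              # puts 1/t between hi and lo_
            A += t * np.outer(v, v)
            s[e] += t
            break
    u, l = u2, l2

evals = np.linalg.eigvalsh(A)
print((s != 0).sum(), "vectors used;", evals.min(), "<=", evals.max())
```

After $6n$ steps the printed eigenvalues should fall between roughly $n$ and $13n$, matching the moving-barrier analysis above.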
Twice-Ramanujan Sparsifiers
Fixing the steps and tightening the parameters gives, with $dn$ vectors, the ratio
$$\frac{\lambda_{\max}(A)}{\lambda_{\min}(A)} \;\le\; \frac{d + 1 + 2\sqrt{d}}{d + 1 - 2\sqrt{d}}$$
That is, fewer than twice as many edges as a Ramanujan expander of the same quality.
Open Questions
The Ramanujan bound: can it be achieved? Do special properties of the vectors that come from graphs help?
Faster algorithms: is every graph well approximated by a union of random Hamiltonian cycles?