
Sparsification of Graphs and Matrices

Daniel A. Spielman

Yale University

joint work with

Joshua Batson (MIT), Nikhil Srivastava (MSR), Shang-Hua Teng (USC)

HUJI, May 21, 2014

Objective of Sparsification:

Approximate any (weighted) graph by a sparse weighted graph.

Spanners - preserve distances [Chew '89]

Cut-sparsifiers - preserve the weight of the edges leaving every set S ⊆ V [Benczur-Karger '96]

(figure: a set S and the edges crossing to its complement)

Spectral Sparsification [S-Teng]

Approximate any (weighted) graph by a sparse weighted graph.

Graph: $G = (V, E, w)$

Laplacian quadratic form:

$x^T L_G x = \sum_{(u,v) \in E} w_{u,v} (x(u) - x(v))^2$

Laplacian Quadratic Form, examples

All edge-weights are 1. The form is the sum of squares of differences across edges:

$x^T L_G x = \sum_{(u,v) \in E} (x(u) - x(v))^2$

(figure: a small example graph with x-values 0, 1, 0, 1, 1 at the vertices; each edge contributes the square of the difference of its endpoint values)

Laplacian Quadratic Form, examples

When x is the characteristic vector of a set S, the form sums the weights of the edges on the boundary of S:

$x^T L_G x = \sum_{(u,v) \in E,\ u \in S,\ v \notin S} w_{u,v}$

(figure: a graph with vertices labeled 0 and 1; S is the set of vertices labeled 1)
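To make the two examples concrete, here is a minimal numpy sketch (the 5-vertex cycle and the set S are my own choices, not from the slides):

```python
import numpy as np

# A made-up example graph: 5 vertices on a cycle, all edge-weights 1.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
weights = [1.0] * len(edges)

def quadratic_form(x, edges, weights):
    # x^T L_G x = sum over edges (u,v) of w_uv * (x(u) - x(v))^2
    return sum(w * (x[u] - x[v]) ** 2 for (u, v), w in zip(edges, weights))

# Characteristic vector of S = {0, 1}: the form counts the boundary edges.
x = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
print(quadratic_form(x, edges, weights))  # 2.0: edges (1,2) and (4,0) leave S
```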

Learning on Graphs (Zhu-Ghahramani-Lafferty '03)

Infer values of a function at all vertices from known values at a few vertices.

Minimize $\sum_{(a,b) \in E} (x(a) - x(b))^2$

subject to the known values.

(figure: a graph with known values 0 and 1 at two vertices; the minimizer assigns values such as 0.5, 0.625, and 0.375 to the rest)
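A minimal sketch of this inference, assuming only numpy and a made-up 4-cycle with two labeled vertices: minimizing the quadratic form with some entries fixed reduces to one linear solve on the unknown block.

```python
import numpy as np

# Hypothetical graph: a 4-cycle, given by its Laplacian.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
n = 4
L = np.zeros((n, n))
for u, v in edges:
    L[u, u] += 1; L[v, v] += 1
    L[u, v] -= 1; L[v, u] -= 1

known = {0: 0.0, 2: 1.0}                      # the labeled vertices
U = [i for i in range(n) if i not in known]   # the unknown vertices

# Minimizing sum (x(a)-x(b))^2 with fixed entries gives the linear system
#   L[U,U] x[U] = -L[U,known] x[known]
b = -L[np.ix_(U, list(known))] @ np.array(list(known.values()))
x = np.zeros(n)
for i, val in known.items():
    x[i] = val
x[U] = np.linalg.solve(L[np.ix_(U, U)], b)
print(x)  # each inferred value is the average of its neighbors: [0, 0.5, 1, 0.5]
```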

Graphs as Resistor Networks

View edges as resistors connecting vertices. Apply voltages at some vertices; measure the induced voltages and current flow.

The induced voltages minimize $\sum_{(a,b) \in E} (v(a) - v(b))^2$ subject to the voltages fixed by the battery.

(figure: a network with 1V and 0V applied at two vertices; induced voltages 0.5V, 0.625V, 0.375V appear at the others)

Graphs as Resistor Networks

Effective resistance between s and t = the potential difference needed to drive one unit of current from s to t.

(figures: two small networks of unit resistors with the potentials induced by a unit s-t flow; one has $R_{\mathrm{eff}}(s,t) = 2$, the other $R_{\mathrm{eff}}(s,t) = 1.5$)
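Effective resistance also has a closed form via the pseudoinverse, $R_{\mathrm{eff}}(s,t) = (\delta_s - \delta_t)^T L_G^{+} (\delta_s - \delta_t)$: a standard identity, not stated on the slides. A small numpy sketch (the series path is my own example):

```python
import numpy as np

def laplacian(n, edges, weights):
    L = np.zeros((n, n))
    for (u, v), w in zip(edges, weights):
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    return L

def effective_resistance(L, s, t):
    # R_eff(s,t) = (delta_s - delta_t)^T L^+ (delta_s - delta_t)
    b = np.zeros(L.shape[0]); b[s], b[t] = 1.0, -1.0
    return b @ np.linalg.pinv(L) @ b

# Path s - m - t with two unit resistors in series: resistances add.
L = laplacian(3, [(0, 1), (1, 2)], [1.0, 1.0])
print(effective_resistance(L, 0, 2))  # 2.0
```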

Laplacian Matrices

$x^T L_G x = \sum_{(u,v) \in E} w_{u,v} (x(u) - x(v))^2$

$L_G = \sum_{(u,v) \in E} w_{u,v} L_{u,v}$

E.g. $L_{1,2} = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \end{pmatrix} \begin{pmatrix} 1 & -1 \end{pmatrix}$

A sum of outer products:

$L_G = \sum_{(u,v) \in E} w_{u,v} \, b_{u,v} b_{u,v}^T$, where $b_{u,v} = \delta_u - \delta_v$

Positive semidefinite. If G is connected, the nullspace is Span(1).
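A quick numerical check of the outer-product form (a sketch; the weighted triangle is my own example):

```python
import numpy as np

n = 3
edges, weights = [(0, 1), (1, 2), (0, 2)], [1.0, 2.0, 3.0]

# L_G = sum over edges of w_uv * b_uv b_uv^T, with b_uv = delta_u - delta_v
L = np.zeros((n, n))
for (u, v), w in zip(edges, weights):
    b = np.zeros(n); b[u], b[v] = 1.0, -1.0
    L += w * np.outer(b, b)

print(np.linalg.eigvalsh(L))  # all eigenvalues >= 0: positive semidefinite
print(L @ np.ones(n))         # the all-ones vector is in the nullspace
```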

Inequalities on Graphs and Matrices

For graphs $G = (V,E,w)$ and $H = (V,F,z)$, and matrices $M$ and $\tilde{M}$:

$M \preccurlyeq \tilde{M}$ if $x^T M x \le x^T \tilde{M} x$ for all $x$

$G \preccurlyeq H$ if $L_G \preccurlyeq L_H$

$G \preccurlyeq kH$ if $L_G \preccurlyeq k L_H$

Approximations of Graphs and Matrices

For graphs $G = (V,E,w)$ and $H = (V,F,z)$:

$M \approx_\epsilon \tilde{M}$ if $\frac{1}{1+\epsilon} M \preccurlyeq \tilde{M} \preccurlyeq (1+\epsilon) M$

$G \approx_\epsilon H$ if $L_G \approx_\epsilon L_H$

That is, for all $x$:

$\frac{1}{1+\epsilon} \le \frac{\sum_{(u,v) \in F} z_{u,v} (x(u) - x(v))^2}{\sum_{(u,v) \in E} w_{u,v} (x(u) - x(v))^2} \le 1+\epsilon$
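To make the definition concrete, a numpy sketch (my own helper, assuming both graphs are connected on the same vertex set) that computes the smallest such ε:

```python
import numpy as np

def psd_pinv_sqrt(A, tol=1e-9):
    # Pseudoinverse square root of a symmetric PSD matrix.
    w, V = np.linalg.eigh(A)
    s = np.array([1.0 / np.sqrt(x) if x > tol else 0.0 for x in w])
    return V @ np.diag(s) @ V.T

def approx_epsilon(LG, LH):
    # Smallest eps with LG/(1+eps) <= LH <= (1+eps) LG: the nonzero
    # eigenvalues of LG^{+/2} LH LG^{+/2} must lie in [1/(1+eps), 1+eps].
    P = psd_pinv_sqrt(LG)
    vals = np.linalg.eigvalsh(P @ LH @ P)
    vals = vals[vals > 1e-9]  # drop the shared nullspace direction (span of 1)
    return max(vals.max() - 1.0, 1.0 / vals.min() - 1.0)
```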

Implications of Approximation

If $G \approx_\epsilon H$:

$L_H$ and $L_G$ have similar eigenvalues.

$L_G^+ \approx_\epsilon L_H^+$, so solutions to systems of linear equations are similar.

Boundaries of sets are similar.

Effective resistances are similar.

Spectral Sparsification [S-Teng]

For an input graph G with n vertices, find a sparse graph H having $O(n)$ edges so that $G \approx_\epsilon H$.

Why? Solving linear equations in Laplacian matrices: sparsification is a key part of the nearly-linear time algorithm, with uses in learning on graphs, maxflow, PDEs, ...

Sparsifiers preserve eigenvectors, eigenvalues, and electrical properties; they generalize expanders; and they give certifiable cut-sparsifiers.

Approximations of Complete Graphs are Expanders

Expanders: d-regular graphs on n vertices (n grows, d fixed) in which every set of vertices has a large boundary; random walks mix quickly; incredibly useful.

Weak expanders: eigenvalues bounded away from 0. Strong expanders: all eigenvalues near d.

Example: Approximating a Complete Graph

For G the complete graph on n vertices, all non-zero eigenvalues of $L_G$ are n. For $x \perp 1$ with $\|x\| = 1$: $x^T L_G x = n$.

For H a d-regular strong expander, all non-zero eigenvalues of $L_H$ are close to d. For $x \perp 1$ with $\|x\| = 1$: $x^T L_H x \sim d$.

So $\frac{n}{d} H$ is a good approximation of G.
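The first claim is a two-line check in numpy (my own illustration, using $L_{K_n} = nI - \mathbf{1}\mathbf{1}^T$):

```python
import numpy as np

n = 6
L = n * np.eye(n) - np.ones((n, n))      # Laplacian of the complete graph K_n
print(np.round(np.linalg.eigvalsh(L)))   # one 0, and n-1 eigenvalues equal to n
```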

Best Approximations of Complete Graphs

Ramanujan Expanders [Margulis, Lubotzky-Phillips-Sarnak]:

$d - 2\sqrt{d-1} \le \lambda(L_H) \le d + 2\sqrt{d-1}$ for all non-zero eigenvalues.

Cannot do better if n grows while d is fixed [Alon-Boppana].

Can we approximate every graph this well?

Example: Dumbbell

G: two complete graphs on n vertices, joined by a single edge of weight 1.

H: on each side, a d-regular Ramanujan graph scaled by n/d; keep the middle edge.

(figure: $K_n$ - edge - $K_n$, and the sparsified version)

Decompose $G = G_1 + G_2 + G_3$ and $H = H_1 + H_2 + H_3$: the two complete graphs ($G_1$, $G_3$), the middle edge ($G_2 = H_2$), and their counterparts in H. Then

$G_1 \preccurlyeq (1+\epsilon) H_1$, $G_2 = H_2$, $G_3 \preccurlyeq (1+\epsilon) H_3$

imply $G \preccurlyeq (1+\epsilon) H$.

Cut-Sparsifiers [Benczur-Karger '96]

For every G, there is an H with $O(n \log n / \epsilon^2)$ edges such that for all $x \in \{0,1\}^n$:

$\frac{1}{1+\epsilon} \le \frac{x^T L_H x}{x^T L_G x} \le 1+\epsilon$

Cut-approximation is different

(figure: vertices 0, 1, 2, 3, ..., m in a path, with k parallel unit edges between consecutive vertices and one long edge from 0 to m)

For the test vector $x(i) = i$:

$x^T L_G x = km + m^2$

Every cut is crossed by at least k short edges, so a cut-sparsifier can afford to drop the long edge; but the long edge contributes the $m^2$ term above, so a spectral sparsifier needs the long edge if $k < m$.

Main Theorems for Graphs

For every $G = (V,E,w)$, there is an $H = (V,F,z)$ with

$|F| \le |V| \, (2+\epsilon)^2/\epsilon^2$ and $G \approx_\epsilon H$.

This is within a factor of 2 of the Ramanujan bound.

By careful random sampling one gets a sparsifier with $O(n \log n / \epsilon^2)$ edges (S-Srivastava '08), in time $O(|E| \log^2 |V| \log(1/\epsilon))$ (Koutis-Levin-Peng '12).

Sparsification by Random Sampling

Assign a probability $p_{u,v}$ to each edge (u,v). Include edge (u,v) in H with probability $p_{u,v}$; if it is included, give it weight $w_{u,v}/p_{u,v}$. Then

$E\left[L_H\right] = \sum_{(u,v) \in E} p_{u,v} \, (w_{u,v}/p_{u,v}) \, L_{u,v} = L_G$

Sparsification by Random Sampling

Choose $p_{u,v}$ to be $w_{u,v}$ times the effective resistance between u and v. Low resistance between u and v means there are many alternate routes for current to flow, so the edge is not critical. The proof is by random matrix concentration bounds (Rudelson, Ahlswede-Winter, Tropp, etc.).
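A compact sketch of this sampling scheme (assumptions: dense numpy linear algebra, exact resistances from the pseudoinverse, and q draws with replacement rather than the independent inclusions described above; real implementations approximate the resistances):

```python
import numpy as np

def sample_sparsifier(n, edges, weights, q, seed=0):
    rng = np.random.default_rng(seed)
    # Build L_G and the edge vectors b_e.
    L = np.zeros((n, n))
    B = []
    for (u, v), w in zip(edges, weights):
        b = np.zeros(n); b[u], b[v] = 1.0, -1.0
        B.append(b)
        L += w * np.outer(b, b)
    # p_e proportional to w_e * R_eff(u,v); for connected G these sum to n-1.
    Lp = np.linalg.pinv(L)
    scores = np.array([w * (b @ Lp @ b) for b, w in zip(B, weights)])
    p = scores / scores.sum()
    # q draws with replacement, each reweighted by w_e / (q p_e), so E[L_H] = L_G.
    new_w = np.zeros(len(edges))
    for e in rng.choice(len(edges), size=q, p=p):
        new_w[e] += weights[e] / (q * p[e])
    return [(edges[e], new_w[e]) for e in np.flatnonzero(new_w)]
```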

Matrix Sparsification

Given $M = B^T B = \sum_e b_e b_e^T$, find $\tilde{M} = \sum_e s_e b_e b_e^T$ with most $s_e = 0$, so that

$\frac{1}{1+\epsilon} M \preccurlyeq \tilde{M} \preccurlyeq (1+\epsilon) M$

Main Theorem (Batson-S-Srivastava)

For $M = \sum_e b_e b_e^T$, there exist $s_e$, at most $n(2+\epsilon)^2/\epsilon^2$ of them non-zero, so that for $\tilde{M} = \sum_e s_e b_e b_e^T$:

$M \approx_\epsilon \tilde{M}$

Simplification of Matrix Sparsification

$\frac{1}{1+\epsilon} M \preccurlyeq \tilde{M} \preccurlyeq (1+\epsilon) M$

is equivalent to

$\frac{1}{1+\epsilon} I \preccurlyeq M^{-1/2} \tilde{M} M^{-1/2} \preccurlyeq (1+\epsilon) I$

Set $v_e = M^{-1/2} b_e$. Then $\sum_e v_e v_e^T = I$, a "decomposition of the identity": $\sum_e \langle u, v_e \rangle^2 = \|u\|^2$ for every u.

We need $\sum_e s_e v_e v_e^T \approx_\epsilon I$.

Random sampling sets $p_e = \|v_e\|^2$.
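A numpy sketch of the whitening step (a random instance of my own; assumes M is positive definite):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 20
B = rng.standard_normal((m, n))
M = B.T @ B                        # M = sum of outer products b_e b_e^T

# Whiten: v_e = M^{-1/2} b_e, so that sum_e v_e v_e^T = I.
w, V = np.linalg.eigh(M)
M_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
Vecs = B @ M_inv_sqrt              # row e is v_e^T
print(np.allclose(Vecs.T @ Vecs, np.eye(n)))  # True: decomposition of identity
```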

What happens when we add a vector? Interlacing.

More precisely: let $p_A$ be the characteristic polynomial of A. Rank-one update:

$p_{A + vv^T}(x) = \left(1 + \sum_i \frac{\langle v, u_i \rangle^2}{\lambda_i - x}\right) p_A(x)$

where $A u_i = \lambda_i u_i$. The new eigenvalues are the zeros of this polynomial; they interlace the $\lambda_i$.
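The identity is easy to sanity-check numerically (numpy sketch; the random A, v, and the test point x = 0.7 are my own choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n)); A = (A + A.T) / 2   # random symmetric A
v = rng.standard_normal(n)
lam, U = np.linalg.eigh(A)                           # A u_i = lam_i u_i

def charpoly(M, x):
    return np.linalg.det(M - x * np.eye(len(M)))

x = 0.7  # any point that is not an eigenvalue of A
lhs = charpoly(A + np.outer(v, v), x)
rhs = (1 + np.sum((U.T @ v) ** 2 / (lam - x))) * charpoly(A, x)
print(np.isclose(lhs, rhs))  # True
```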

Physical model of interlacing

The $\lambda_i$ are positive unit charges resting at barriers on a slope. $u_i$ is the i-th eigenvector, $v$ is the added vector, and $\langle v, u_i \rangle^2$ is the charge placed on barrier i.

Barriers repel the eigenvalues (inverse-law repulsion); gravity pulls them back down the slope.

Examples

Ex 1: All weight on $u_1$ ($v = u_1$). The eigenvalue is pushed up against the next barrier; gravity keeps the others resting on their barriers.

Ex 2: Equal weight on $u_1$ and $u_2$.

Ex 3: Equal weight on all of $u_1, u_2, \ldots, u_n$: charge $+1/n$ on every barrier.

Adding a random $v_e$

Because the $v_e$ form a decomposition of the identity, $E_e\left[\langle v_e, u_i \rangle^2\right] = 1/m$, so

$E_e\left[p_{A + v_e v_e^T}(x)\right] = \left(1 + \frac{1}{m} \sum_i \frac{1}{\lambda_i - x}\right) p_A(x) = p_A(x) - \frac{1}{m} \frac{d}{dx} \, p_A(x)$

Many random $v_e$

$E_e\left[p_{A + v_e v_e^T}(x)\right] = \left(1 - \frac{1}{m} \frac{d}{dx}\right) p_A(x)$

$E_{e_1,\ldots,e_k}\left[p_{v_{e_1} v_{e_1}^T + \cdots + v_{e_k} v_{e_k}^T}(x)\right] = \left(1 - \frac{1}{m} \frac{d}{dx}\right)^k x^n$

This is an associated Laguerre polynomial! For $k = n/\epsilon^2$, its roots lie between $\frac{(1-\epsilon)^2 n}{m \epsilon^2}$ and $\frac{(1+\epsilon)^2 n}{m \epsilon^2}$.
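The expected polynomial is easy to experiment with (numpy sketch; the parameters n, m, ε are my own, and the printed window is the bound quoted above):

```python
import numpy as np

def expected_poly_roots(n, m, k):
    # Apply (1 - (1/m) d/dx) k times to x^n, using numpy's coefficient
    # convention (highest degree first), then take roots.
    p = np.zeros(n + 1); p[0] = 1.0                  # coefficients of x^n
    for _ in range(k):
        p = p - (1.0 / m) * np.concatenate(([0.0], np.polyder(p)))
    return np.sort(np.roots(p))

n, m, eps = 6, 60, 1.0
roots = expected_poly_roots(n, m, int(n / eps**2))
print(roots.min(), roots.max())
# Compare with the predicted window [(1-eps)^2, (1+eps)^2] * n/(m*eps^2):
print((1 - eps)**2 * n / (m * eps**2), (1 + eps)**2 * n / (m * eps**2))
```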

Matrix Sparsification Proof Sketch

Have $\sum_e v_e v_e^T = I$. Want $\sum_e s_e v_e v_e^T \approx_\epsilon I$.

Will do this with $|\{e : s_e \ne 0\}| \le 6n$ and all eigenvalues between 1 and 13, giving $\epsilon \approx 2.6$.

Broad outline: moving barriers

Step 1

Start with all eigenvalues at 0 and barriers at -n and n. Move the lower barrier up by 1/3 and the upper barrier up by 2: they now sit at -n + 1/3 (a tighter constraint) and n + 2 (a looser constraint).

Step i+1

At each step, move the lower barrier up by 1/3 and the upper barrier up by 2, and add a rank-one term that keeps all eigenvalues between the barriers.

Step 6n

After 6n steps the barriers have moved from -n and n to n and 13n, so all eigenvalues lie between n and 13n: a 2.6-approximation with 6n vectors.

Problem

We need to show that an appropriate vector to add always exists. Interlacing alone is not strong enough for induction:

If there are many small eigenvalues, we can only move one of them.

Bunched large eigenvalues repel the highest one.

The Lower Barrier Potential Function

$\Phi_\ell(A) = \sum_i \frac{1}{\lambda_i - \ell} = \mathrm{Tr}\left((A - \ell I)^{-1}\right)$

$\Phi_\ell(A) \le 1 \implies \lambda_{\min}(A) \ge \ell + 1$

More generally: no $\lambda_i$ within distance 1 of the barrier, no 2 of them within distance 2, no 3 within distance 3, ..., no k within distance k.

The Upper Barrier Potential Function

$\Phi^u(A) = \sum_i \frac{1}{u - \lambda_i} = \mathrm{Tr}\left((uI - A)^{-1}\right)$
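Both potentials are one-liners (numpy sketch; the diagonal A and the barrier positions are arbitrary illustrations):

```python
import numpy as np

def upper_potential(A, u):
    # Phi^u(A) = Tr((uI - A)^{-1}); meaningful only when u > lambda_max(A)
    return np.trace(np.linalg.inv(u * np.eye(len(A)) - A))

def lower_potential(A, l):
    # Phi_l(A) = Tr((A - lI)^{-1}); meaningful only when l < lambda_min(A)
    return np.trace(np.linalg.inv(A - l * np.eye(len(A))))

A = np.diag([1.0, 2.0, 3.0])
print(upper_potential(A, 4.0))   # 1/3 + 1/2 + 1 = 1.833...
print(lower_potential(A, 0.0))   # 1 + 1/2 + 1/3 = 1.833...
```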

The Beginning

A = 0: all n eigenvalues sit at 0, with barriers at -n and n. Both potentials start at $n \cdot \frac{1}{n} = 1$.

Step i+1

Lemma. We can always choose some $s_e v_e v_e^T$ to add so that neither potential increases when the barriers move by +1/3 and +2.

Step 6n

All eigenvalues lie between n and 13n: a 2.6-approximation with 6n vectors.

Goal

Prove the Lemma: some $s_e v_e v_e^T$ can always be chosen so that the potentials do not increase as the barriers move by +1/3 and +2.

Upper Barrier Update

Add $svv^T$ and set $u' = u + 2$. By the Sherman-Morrison formula:

$\Phi^{u'}(A + svv^T) = \mathrm{Tr}\left((u'I - A - svv^T)^{-1}\right) = \Phi^{u'}(A) + \frac{s \, v^T (u'I - A)^{-2} v}{1 - s \, v^T (u'I - A)^{-1} v}$

Need $\Phi^{u'}(A + svv^T) \le \Phi^u(A)$.

How much of $vv^T$ can we add?

Rearranging: $\Phi^{u'}(A + svv^T) \le \Phi^u(A)$ iff

$\frac{1}{s} \ge v^T \left( \frac{(u'I - A)^{-2}}{\Phi^u(A) - \Phi^{u'}(A)} + (u'I - A)^{-1} \right) v$

Write this as $1 \ge s \, v^T U_A v$.

Lower Barrier

Similarly: $\Phi_{\ell'}(A + svv^T) \le \Phi_\ell(A)$ iff

$1 \le s \, v^T \left( \frac{(A - \ell' I)^{-2}}{\Phi_{\ell'}(A) - \Phi_\ell(A)} - (A - \ell' I)^{-1} \right) v$

Write this as $1 \le s \, v^T L_A v$.

Goal

Show that we can always add some vector $svv^T$ while respecting both barriers (+1/3 and +2):

Need $s \, v^T U_A v \le 1 \le s \, v^T L_A v$.

Two expectations

Need: $s \, v^T U_A v \le 1 \le s \, v^T L_A v$

Can show: $E_e\left[v_e^T U_A v_e\right] \le \frac{3}{2m}$ and $E_e\left[v_e^T L_A v_e\right] \ge \frac{2}{m}$

So: $E_e\left[v_e^T U_A v_e\right] \le E_e\left[v_e^T L_A v_e\right]$

Hence there exists an e with $v_e^T U_A v_e \le v_e^T L_A v_e$, and an s that puts $1/s$ between them.

Bounding expectations

$v^T U_A v = \mathrm{Tr}\left(U_A v v^T\right)$

$E_e\left[\mathrm{Tr}\left(U_A v_e v_e^T\right)\right] = \mathrm{Tr}\left(U_A \, E_e\left[v_e v_e^T\right]\right) = \mathrm{Tr}\left(U_A\right)/m$

using $E_e\left[v_e v_e^T\right] = I/m$.

Bounding expectations

$\mathrm{Tr}\left(U_A\right) = \frac{\mathrm{Tr}\left((u'I - A)^{-2}\right)}{\Phi^u(A) - \Phi^{u'}(A)} + \mathrm{Tr}\left((u'I - A)^{-1}\right)$

The second term is $\Phi^{u'}(A) \le \Phi^u(A) \le 1$, as the barrier function is monotone decreasing in u.

The numerator of the first term is the derivative of the barrier function; as the barrier function is convex, the first term is at most $\frac{1}{u' - u} = \frac{1}{2}$.

So $\mathrm{Tr}(U_A) \le 1/2 + 1 = 3/2$.

Similarly, $\mathrm{Tr}\left(L_A\right) \ge \frac{1}{\ell' - \ell} - 1 = \frac{1}{1/3} - 1 = 2$.

Step i+1

These bounds prove the Lemma: some $s_e v_e v_e^T$ can always be added so that neither potential increases as the barriers move by +1/3 and +2.
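Putting the pieces together, here is a deliberately naive numpy sketch of the whole barrier algorithm. The step sizes 1/3 and 2, the potentials, and the $U_A$/$L_A$ tests are exactly as on the slides; the linear scan over vectors and the choice $t = 2/(v^T U_A v + v^T L_A v)$ are my own concrete instantiation of "an appropriate vector always exists".

```python
import numpy as np

def bss_sparsify(V):
    """Sketch of the Batson-Spielman-Srivastava barrier algorithm.

    V: an m x n array whose rows v_e satisfy sum_e v_e v_e^T = I.
    Runs 6n steps with barrier shifts +1/3 and +2, so the returned s has
    at most 6n nonzeros and sum_e s_e v_e v_e^T has its eigenvalues
    (roughly) between n and 13n.
    """
    m, n = V.shape
    A = np.zeros((n, n))
    s = np.zeros(m)
    l, u = -float(n), float(n)
    I = np.eye(n)
    for _ in range(6 * n):
        l, u = l + 1.0 / 3.0, u + 2.0          # move the barriers
        Xu = np.linalg.inv(u * I - A)          # (u'I - A)^{-1}
        Xl = np.linalg.inv(A - l * I)          # (A - l'I)^{-1}
        # Potential decreases bought by the shifts:
        # Phi^u(A) - Phi^{u'}(A) > 0 and Phi_{l'}(A) - Phi_l(A) > 0.
        du = np.trace(np.linalg.inv((u - 2.0) * I - A)) - np.trace(Xu)
        dl = np.trace(Xl) - np.trace(np.linalg.inv(A - (l - 1.0 / 3.0) * I))
        UA = Xu @ Xu / du + Xu                 # upper-barrier quadratic form
        LA = Xl @ Xl / dl - Xl                 # lower-barrier quadratic form
        for e in range(m):
            v = V[e]
            uq, lq = v @ UA @ v, v @ LA @ v
            if uq <= lq:                       # the Lemma guarantees such an e
                t = 2.0 / (uq + lq)            # any t with uq <= 1/t <= lq
                s[e] += t
                A += t * np.outer(v, v)
                break
    return s
```

Each step costs a few dense inverses, so this is for understanding rather than performance; the point is that the potentials never increase while the barriers march from -n and n up to n and 13n.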

Twice-Ramanujan Sparsifiers

Fixing roughly dn steps and tightening the parameters gives ratio

$\frac{\lambda_{\max}(A)}{\lambda_{\min}(A)} \le \frac{d + 1 + 2\sqrt{d}}{d + 1 - 2\sqrt{d}}$

Less than twice as many edges as used by a Ramanujan expander of the same quality.

Open Questions

The Ramanujan bound: can it be met? Do the vectors that come from graphs have special properties?

Faster algorithms: is a union of random Hamiltonian cycles a good sparsifier?