Graph Sparsifiers: A Survey
Nick Harvey
Based on work by: Batson, Benczur, de Carli Silva, Fung, Hariharan, Harvey, Karger, Panigrahi, Sato, Spielman, Srivastava and Teng
Approximating Dense Objects by Sparse Ones
• Floor joists
• Image compression
Approximating Dense Graphs by Sparse Ones
• Spanners: Approximate distances to within a factor $\alpha$ using only $O(n^{1+2/\alpha})$ edges
• Low-stretch trees: Approximate most distances to within $O(\log n)$ using only $n-1$ edges
(n = # vertices)
Overview
• Definitions
– Cut & Spectral Sparsifiers
• Cut Sparsifiers
– A combinatorial construction
• Spectral Sparsifiers
– A random sampling construction
– Derandomization
Cut Sparsifiers (Karger ‘94)
• Input: An undirected graph G=(V,E) with weights $u : E \to \mathbb{R}_+$
• Output: A subgraph H=(V,F) of G with weights $w : F \to \mathbb{R}_+$ such that |F| is small and
$w(\delta_H(U)) = (1 \pm \epsilon)\, u(\delta_G(U)) \quad \forall U \subseteq V$
• Here $u(\delta_G(U))$ is the weight of edges between U and V\U in G, and $w(\delta_H(U))$ is the weight of edges between U and V\U in H
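The cut-sparsifier condition can be checked directly on tiny graphs. The following is a minimal sketch, assuming edges are stored as a dictionary from vertex pairs to weights; the helper names and the toy graph are illustrative and not from the talk. It enumerates every cut, so it is exponential in n and only meant to make the definition concrete.

```python
from itertools import combinations

def cut_weight(weights, U):
    """Total weight of edges with exactly one endpoint in U.

    weights: dict mapping an edge (s, t) to its non-negative weight.
    """
    U = set(U)
    return sum(wt for (s, t), wt in weights.items() if (s in U) != (t in U))

def is_cut_sparsifier(vertices, u, w, eps):
    """Brute-force check that w(dH(U)) lies in (1 +/- eps) * u(dG(U)) for all U.

    Enumerates every non-empty proper subset U, so use only on tiny graphs.
    """
    vertices = list(vertices)
    for size in range(1, len(vertices)):
        for U in combinations(vertices, size):
            g = cut_weight(u, U)
            h = cut_weight(w, U)
            if not ((1 - eps) * g <= h <= (1 + eps) * g):
                return False
    return True

# Hypothetical toy data: G is the complete graph on 4 vertices with unit
# weights, H is a 4-cycle inside G with every edge reweighted to 1.5.
V = [0, 1, 2, 3]
u = {(a, b): 1.0 for a, b in combinations(V, 2)}
w = {(0, 1): 1.5, (1, 2): 1.5, (2, 3): 1.5, (0, 3): 1.5}
```

On this toy pair the cut ratios range from 0.75 to 1.5, so H passes the check for eps = 0.5 and fails it for eps = 0.2.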
Generic Application of Cut Sparsifiers
• Without sparsification: (Dense) Input graph G → (Slow) Algorithm A for some problem P → Exact/Approx Output
• With sparsification: (Dense) Input graph G → (Efficient) Sparsification Algorithm S → Sparse graph H approx preserving solution of P → Algorithm A (now faster) → Approximate Output
• Example problems P: Min s-t cut, Sparsest cut, Max cut, …
Relation to Expander Graphs
• Graph H on V is an expander if, for some constant c,
$|\delta_H(U)| \ge c \cdot |U| \quad \forall U \subseteq V,\ |U| \le n/2$
• Let G be the complete graph on V. Note that
$|\delta_G(U)| = |U| \cdot |V \setminus U| \le n \cdot |U|$
• If we give all edges of H weight w=n, then
$w(\delta_H(U)) \ge c \cdot |\delta_G(U)| \quad \forall U \subseteq V,\ |U| \le n/2$
• Expanders are similar to sparsifiers of the complete graph
Relation to Expander Graphs
• Fact: Pick a random graph where each edge appears independently with probability $p = \Theta(\log(n)/n)$. This gives an expander with $O(n \log n)$ edges with high probability.
Spectral Sparsifiers (Spielman-Teng ‘04)
• Input: An undirected graph G=(V,E) with weights $u : E \to \mathbb{R}_+$
• Def: The Laplacian is the matrix $L_G$ such that $x^T L_G x = \sum_{st \in E} u_{st} (x_s - x_t)^2 \quad \forall x \in \mathbb{R}^V$.
• $L_G$ is positive semidefinite since this is $\ge 0$.
• Example: Electrical Networks
– View edge st as a resistor of resistance $1/u_{st}$.
– Impose voltage $x_v$ at every vertex v.
– Ohm’s Power Law: $P = V^2/R$.
– Power consumed on edge st is $u_{st} (x_s - x_t)^2$.
– Total power consumed is $x^T L_G x$.
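To make the Laplacian definition concrete, here is a minimal sketch, assuming NumPy is available and vertices are numbered 0..n-1. The function names and the three-edge example are illustrative assumptions. It builds $L_G$ from edge weights and compares $x^T L_G x$ with the edge-by-edge power sum.

```python
import numpy as np

def laplacian(n, weights):
    """Weighted Laplacian of a graph on vertices 0..n-1.

    weights: dict mapping an edge (s, t) to its conductance u_st.
    """
    L = np.zeros((n, n))
    for (s, t), u in weights.items():
        L[s, s] += u
        L[t, t] += u
        L[s, t] -= u
        L[t, s] -= u
    return L

def power_consumed(weights, x):
    """Sum over edges of u_st * (x_s - x_t)^2, the total power for voltages x."""
    return sum(u * (x[s] - x[t]) ** 2 for (s, t), u in weights.items())

# Hypothetical example: a weighted triangle with voltages 1, 0, -1.
weights = {(0, 1): 2.0, (1, 2): 1.0, (0, 2): 0.5}
x = np.array([1.0, 0.0, -1.0])
L = laplacian(3, weights)
quad = float(x @ L @ x)              # quadratic form x^T L_G x
total = power_consumed(weights, x)   # same quantity, summed edge by edge
```

Both quantities equal 2·1 + 1·1 + 0.5·4 = 5 for this example.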
Spectral Sparsifiers
• Output: A subgraph H=(V,F) of G with weights $w : F \to \mathbb{R}_+$ such that |F| is small and
$x^T L_H x = (1 \pm \epsilon)\, x^T L_G x \quad \forall x \in \mathbb{R}^V$
• Spectral Sparsifier ⇒ Cut Sparsifier: taking x to be the indicator vector of U gives $x^T L_G x = u(\delta_G(U))$, so the condition above implies
$w(\delta_H(U)) = (1 \pm \epsilon)\, u(\delta_G(U)) \quad \forall U \subseteq V$
Cut vs Spectral Sparsifiers
• Number of Constraints:
– Cut: $w(\delta_H(U)) = (1 \pm \epsilon)\, u(\delta_G(U)) \ \ \forall U \subseteq V$ ($2^n$ constraints)
– Spectral: $x^T L_H x = (1 \pm \epsilon)\, x^T L_G x \ \ \forall x \in \mathbb{R}^V$ ($\infty$ constraints)
• Spectral constraints are SDP feasibility constraints:
$(1-\epsilon)\, x^T L_G x \le x^T L_H x \le (1+\epsilon)\, x^T L_G x \ \ \forall x \in \mathbb{R}^V \iff (1-\epsilon) L_G \preceq L_H \preceq (1+\epsilon) L_G$
Here $X \preceq Y$ means $Y - X$ is positive semidefinite.
• Spectral constraints are actually easier to handle
– Checking “Is H a spectral sparsifier of G?” is in P
– Checking “Is H a cut sparsifier of G?” is non-uniform sparsest cut, so NP-hard
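One way to see that the spectral check is polynomial-time is a dense eigenvalue computation. The sketch below assumes NumPy, a connected graph G, and H on the same vertex set; the function names and the tolerance are illustrative choices. It finds the tightest lo and hi with lo·$L_G \preceq L_H \preceq$ hi·$L_G$ by working on the range of $L_G$.

```python
import numpy as np
from itertools import combinations

def laplacian(n, weights):
    """Weighted Laplacian on vertices 0..n-1 from a dict (s, t) -> weight."""
    L = np.zeros((n, n))
    for (s, t), u in weights.items():
        L[s, s] += u
        L[t, t] += u
        L[s, t] -= u
        L[t, s] -= u
    return L

def spectral_approx_range(L_G, L_H, tol=1e-9):
    """Return (lo, hi) such that lo * L_G <= L_H <= hi * L_G on range(L_G).

    Assumes G is connected, so the kernel of L_G is the all-ones direction,
    which is also in the kernel of L_H.
    """
    vals, vecs = np.linalg.eigh(L_G)
    keep = vals > tol
    # Columns of W rescale range(L_G) so that L_G becomes the identity there.
    W = vecs[:, keep] / np.sqrt(vals[keep])
    ev = np.linalg.eigvalsh(W.T @ L_H @ W)
    return float(ev[0]), float(ev[-1])

def is_spectral_sparsifier(L_G, L_H, eps, slack=1e-9):
    lo, hi = spectral_approx_range(L_G, L_H)
    return lo >= 1 - eps - slack and hi <= 1 + eps + slack

# Hypothetical toy data: unit-weight K4 versus a 4-cycle with weights 1.5.
n = 4
L_G = laplacian(n, {(a, b): 1.0 for a, b in combinations(range(n), 2)})
L_H = laplacian(n, {(0, 1): 1.5, (1, 2): 1.5, (2, 3): 1.5, (0, 3): 1.5})
lo, hi = spectral_approx_range(L_G, L_H)
```

For this toy pair the range is [0.75, 1.5], matching the extreme cut ratios of the same example.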
Application of Spectral Sparsifiers
• Consider the linear system $L_G x = b$. Actual solution is $x := L_G^{-1} b$.
• Instead, compute $y := L_H^{-1} b$, where H is a spectral sparsifier of G.
• We know: $(1-\epsilon) L_G \preceq L_H \preceq (1+\epsilon) L_G$
⇒ y has low multiplicative error: $\|y - x\|_{L_G} \le 2\epsilon \|x\|_{L_G}$
• Computing y is fast since H is sparse: conjugate gradient method takes $O(n|F|)$ time (where |F| = # nonzero entries of $L_H$)
• Theorem: [Spielman-Teng ‘04, Koutis-Miller-Peng ‘10] Can compute a vector y with low multiplicative error in $O(m \log n (\log \log n)^2)$ time. (m = # edges of G)
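The error bound for solving with $L_H$ in place of $L_G$ can be observed numerically. This is a small sketch and not a fast solver: it assumes NumPy, a connected graph, and a right-hand side b whose entries sum to zero, and it inverts the Laplacian on the space orthogonal to the all-ones vector using a dense solve. The helper names and the toy graphs are assumptions for illustration.

```python
import numpy as np
from itertools import combinations

def laplacian(n, weights):
    """Weighted Laplacian on vertices 0..n-1 from a dict (s, t) -> weight."""
    L = np.zeros((n, n))
    for (s, t), u in weights.items():
        L[s, s] += u
        L[t, t] += u
        L[s, t] -= u
        L[t, s] -= u
    return L

def solve_laplacian(L, b):
    """Solve L x = b for a connected graph and b with entries summing to zero.

    Adding the all-ones matrix divided by n makes L invertible without
    changing the solution that is orthogonal to the all-ones vector.
    """
    n = L.shape[0]
    return np.linalg.solve(L + np.ones((n, n)) / n, b)

def energy_norm(L, z):
    """The L-norm sqrt(z^T L z) used in the error bound."""
    return float(np.sqrt(max(z @ L @ z, 0.0)))

# Hypothetical toy data: unit-weight K4 and a 4-cycle with weights 1.5,
# for which (1 - eps) L_G <= L_H <= (1 + eps) L_G holds with eps = 0.5.
n, eps = 4, 0.5
L_G = laplacian(n, {(a, b): 1.0 for a, b in combinations(range(n), 2)})
L_H = laplacian(n, {(0, 1): 1.5, (1, 2): 1.5, (2, 3): 1.5, (0, 3): 1.5})
b = np.array([1.0, -2.0, 0.5, 0.5])   # entries sum to zero
x = solve_laplacian(L_G, b)           # exact solution
y = solve_laplacian(L_H, b)           # solution computed from the sparsifier
rel_err = energy_norm(L_G, y - x) / energy_norm(L_G, x)
```

Here the relative error in the $L_G$-norm comes out to 1/3, comfortably below the $2\epsilon = 1$ bound.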
Results on Sparsifiers
• Cut Sparsifiers: Karger ‘94; Benczur-Karger ‘96; Fung-Hariharan-Harvey-Panigrahi ‘11
• Spectral Sparsifiers: Spielman-Teng ‘04; Spielman-Srivastava ‘08; Batson-Spielman-Srivastava ‘09; de Carli Silva-Harvey-Sato ‘11
• Techniques: Combinatorial and Linear Algebraic
• Batson-Spielman-Srivastava ‘09 and de Carli Silva-Harvey-Sato ‘11 construct sparsifiers with $O(n / \epsilon^2)$ edges; the others construct sparsifiers with $n \log^{O(1)} n / \epsilon^2$ edges
Sparsifiers by Random Sampling
• The complete graph is easy! Random sampling gives an expander (i.e., sparsifier) with $O(n \log n)$ edges.
Sparsifiers by Random Sampling
• Can’t sample edges with same probability!
• Idea [BK’96]: Sample low-connectivity edges with high probability, and high-connectivity edges with low probability
(Figure labels: “Keep this” marks a low-connectivity edge; “Eliminate most of these” marks the high-connectivity edges.)
Non-uniform sampling algorithm [BK’96]
• Input: Graph G=(V,E), weights $u : E \to \mathbb{R}_+$
• Output: A subgraph H=(V,F) with weights $w : F \to \mathbb{R}_+$
  Choose parameter $\rho$
  Compute probabilities $\{ p_e : e \in E \}$
  For i=1 to $\rho$
    For each edge $e \in E$
      With probability $p_e$,
        Add e to F
        Increase $w_e$ by $u_e/(\rho p_e)$
• Note: $E[|F|] \le \rho \cdot \sum_e p_e$
• Note: $E[w_e] = u_e \ \forall e \in E$ ⇒ For every $U \subseteq V$, $E[w(\delta_H(U))] = u(\delta_G(U))$
• Can we do this so that the cut values are tightly concentrated and $E[|F|] = n \log^{O(1)} n$?
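The sampling loop above translates almost line for line into code. The following sketch assumes edges are dictionary keys and that the probabilities $p_e$ are supplied by the caller; how to choose them is the subject of the following slides. The function name and the seeded random generator are illustrative choices.

```python
import random

def nonuniform_sample(u, p, rho, rng=None):
    """Run rho rounds of independent edge sampling.

    u: dict edge -> weight u_e
    p: dict edge -> sampling probability p_e in (0, 1]
    Returns w: dict edge -> weight, containing only the sampled edges F.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    w = {}
    for _ in range(rho):
        for e, ue in u.items():
            if rng.random() < p[e]:
                # Each hit adds u_e / (rho * p_e), so E[w_e] = u_e.
                w[e] = w.get(e, 0.0) + ue / (rho * p[e])
    return w
```

With every $p_e = 1$ the output reproduces the input weights, which is a quick sanity check of the unbiasedness note on the slide.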
Benczur-Karger ‘96
• Run the non-uniform sampling algorithm above with the following choices:
– Set $\rho = O(\log n/\epsilon^2)$.
– Let $p_e$ = 1/“strength” of edge e.
• Cuts are preserved to within $(1 \pm \epsilon)$ and $E[|F|] = O(n \log n/\epsilon^2)$
• Can approximate all values $p_e$ in $m \log^{O(1)} n$ time.
Fung-Hariharan-Harvey-Panigrahi ‘11
• Run the non-uniform sampling algorithm above with the following choices:
– Set $\rho = O(\log^2 n/\epsilon^2)$.
– Let $p_{st}$ = 1/(min cut separating s and t)
• Cuts are preserved to within $(1 \pm \epsilon)$ and $E[|F|] = O(n \log^2 n/\epsilon^2)$
• Can approximate all values $p_{st}$ in $O(m + n \log n)$ time
• Let $k_{uv}$ = min size of a cut separating u and v. Recall sampling probability is $p_e = 1/k_e$
• Partition edges into connectivity classes $E = E_1 \cup E_2 \cup \dots \cup E_{\log n}$ where $E_i = \{ e : 2^{i-1} \le k_e < 2^i \}$
• Prove weight of sampled edges that each cut takes from each connectivity class has low error
• Key point: Edges in $\delta(U) \cap E_i$ have roughly same $p_e$
• This yields a sparsifier
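The partition into connectivity classes is simple once the values $k_e$ are known. The sketch below assumes an unweighted graph, so each $k_e$ is a positive integer, and takes the connectivities as given rather than computing them; the function name and example values are hypothetical.

```python
from collections import defaultdict

def connectivity_classes(k):
    """Group edges into classes E_i = { e : 2^(i-1) <= k_e < 2^i }.

    k: dict edge -> positive integer connectivity k_e.
    For a positive integer, bit_length() is exactly the index i with
    2^(i-1) <= k_e < 2^i.
    """
    classes = defaultdict(list)
    for e, ke in k.items():
        classes[ke.bit_length()].append(e)
    return dict(classes)

# Hypothetical connectivities for five edges.
k = {("a", "b"): 1, ("b", "c"): 2, ("c", "d"): 3, ("d", "e"): 4, ("e", "f"): 7}
classes = connectivity_classes(k)
```

Within one class the probabilities $p_e = 1/k_e$ differ by less than a factor of 2, which is the "roughly same $p_e$" property used in the analysis.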
Prove weight of sampled edges that each cut takes from each connectivity class has low error
• Notation:
– $C = \delta(U)$ is a cut
– $C_i = \delta(U) \cap E_i$ is a cut-induced set
• Need to prove: for every cut-induced set $C_i$, the sampled weight of $C_i$ has low error
(Figure: a cut split into cut-induced sets $C_1, C_2, C_3, C_4$)
• Key Ingredients
– Hoeffding bound: Prove that, for each cut-induced set $C_i$, the probability of a large error is small
– Bound on # small cut-induced sets: For most of these events, u(C) is large. In other words, #{ cut-induced sets $C_i$ induced by a small cut C } is small.
Counting Small Cut-Induced Sets
• Theorem: [Fung-Hariharan-Harvey-Panigrahi ‘11] Let G=(V,E) be a graph. Fix any $B \subseteq E$. Suppose $k_e \ge K$ for all e in B. ($k_{uv}$ = min size of a cut separating u and v.) Then, for every $\alpha \ge 1$,
$|\{ \delta(U) \cap B : |\delta(U)| \le \alpha K \}| < n^{2\alpha}$.
• Corollary: [Karger ‘93] Let G=(V,E) be a graph. Let K be the edge-connectivity of G (i.e., global min cut value). Then, for every $\alpha \ge 1$,
$|\{ \delta(U) : |\delta(U)| \le \alpha K \}| < n^{2\alpha}$.
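The corollary can be sanity-checked by brute force on a small graph. The sketch below counts distinct cuts of size at most $\alpha K$ in a 6-cycle, whose edge connectivity is K = 2; the function name and the choice of graph are illustrative assumptions, and the enumeration is exponential in n.

```python
from itertools import combinations

def count_small_cuts(n, edges, alpha, K):
    """Count distinct edge sets d(U) with |d(U)| <= alpha * K.

    Enumerates all non-empty proper subsets U of vertices 0..n-1 and
    de-duplicates cuts, since U and its complement give the same cut.
    """
    cuts = set()
    for size in range(1, n):
        for U in combinations(range(n), size):
            U = set(U)
            cut = frozenset(e for e in edges if (e[0] in U) != (e[1] in U))
            if len(cut) <= alpha * K:
                cuts.add(cut)
    return len(cuts)

# Hypothetical example: the cycle on 6 vertices, with global min cut K = 2.
n = 6
cycle = [(i, (i + 1) % n) for i in range(n)]
K = 2
small_1 = count_small_cuts(n, cycle, alpha=1, K=K)  # cuts with 2 edges
small_2 = count_small_cuts(n, cycle, alpha=2, K=K)  # cuts with 2 or 4 edges
```

In a cycle every non-empty even-sized edge set is a cut, so the counts are 15 and 30, well under the bounds $6^2 = 36$ and $6^4 = 1296$.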
Summary for Cut Sparsifiers
• Do non-uniform sampling of edges, with probabilities based on “connectivity”
• Analysis involves:
– Decomposing the graph
– Hoeffding bounds to analyze each “cut”
– Cut-counting theorem: “few small cuts”
• BK’96 had a weaker cut-counting theorem, but had a more complicated “connectivity” notion.
• Can get sparsifiers with $O(n \log n / \epsilon^2)$ edges
– Optimal for any independent sampling algorithm
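Spectral Sparsification
• Input: Graph G=(V,E), weights $u : E \to \mathbb{R}_+$
• Recall: $x^T L_G x = \sum_{st \in E} u_{st} (x_s - x_t)^2$. Call the term $u_{st} (x_s - x_t)^2$ for edge st $x^T L_{st} x$, so that $L_G = \sum_{e \in E} L_e$.
• Goal: Find weights $w : E \to \mathbb{R}_+$ such that very few $w_e$ are non-zero, and
$(1-\epsilon)\, x^T L_G x \le \sum_{e \in E} w_e\, x^T L_e x \le (1+\epsilon)\, x^T L_G x \ \ \forall x \in \mathbb{R}^V$
$\iff (1-\epsilon) L_G \preceq \sum_{e \in E} w_e L_e \preceq (1+\epsilon) L_G$
• General Problem: Given matrices $L_e$ satisfying $\sum_e L_e = L_G$, find coefficients $w_e$, mostly zero, such that $(1-\epsilon) L_G \preceq \sum_e w_e L_e \preceq (1+\epsilon) L_G$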
The General Problem: Sparsifying Sums of PSD Matrices
• General Problem: Given PSD matrices $L_e$ s.t. $\sum_e L_e = L_G$, find coefficients $w_e$, mostly zero, such that $(1-\epsilon) L_G \preceq \sum_e w_e L_e \preceq (1+\epsilon) L_G$
• Theorem: [Ahlswede-Winter ’02] Randomized alg gives w with $O(n \log n/\epsilon^2)$ non-zeros.
• Theorem: [de Carli Silva-Harvey-Sato ‘11], building on [Batson-Spielman-Srivastava ‘09] Deterministic alg gives w with $O(n/\epsilon^2)$ non-zeros.
– Cut & spectral sparsifiers with $O(n/\epsilon^2)$ edges [BSS’09]
– Sparsifiers with more properties and $O(n/\epsilon^2)$ edges [dHS’11]
Vector Case
• Vector problem: Given vectors $v_e \in [0,1]^n$ s.t. $\sum_e v_e = v$, find coefficients $w_e$, mostly zero, such that $\| \sum_e w_e v_e - v \|_\infty \le \epsilon$
• Theorem [Althofer ‘94, Lipton-Young ‘94]: There is a w with $O(\log n/\epsilon^2)$ non-zeros.
• Proof: Random sampling & Hoeffding inequality.
• Multiplicative version: There is a w with $O(n \log n/\epsilon^2)$ non-zeros such that $(1-\epsilon) v \le \sum_e w_e v_e \le (1+\epsilon) v$
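Concentration Inequalities
• Theorem: [Chernoff ‘52, Hoeffding ‘63] Let $Y_1, \dots, Y_k$ be i.i.d. random non-negative real numbers s.t. $E[Y_i] = Z$ and $Y_i \le uZ$. Then, for $0 < \epsilon \le 1$,
$\Pr\left[ \tfrac{1}{k} \sum_{i=1}^k Y_i \notin [(1-\epsilon) Z, (1+\epsilon) Z] \right] \le \exp(-\Omega(\epsilon^2 k/u))$
• Theorem: [Ahlswede-Winter ‘02] Let $Y_1, \dots, Y_k$ be i.i.d. random PSD $n \times n$ matrices s.t. $E[Y_i] = Z$ and $Y_i \preceq uZ$. Then, for $0 < \epsilon \le 1$, the condition $(1-\epsilon) Z \preceq \tfrac{1}{k} \sum_{i=1}^k Y_i \preceq (1+\epsilon) Z$ fails with probability at most
$n \cdot \exp(-\Omega(\epsilon^2 k/u))$
• The only difference is the factor n in the matrix bound.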
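“Balls & Bins” Example
• Problem: Throw k balls into n bins. Want max load / min load $\le 1+\epsilon$. How big should k be?
• AW Theorem: Let $Y_1, \dots, Y_k$ be i.i.d. random PSD matrices such that $E[Y_i] = Z$ and $Y_i \preceq uZ$. Then $(1-\epsilon) Z \preceq \tfrac{1}{k} \sum_i Y_i \preceq (1+\epsilon) Z$ except with probability $n \cdot \exp(-\Omega(\epsilon^2 k/u))$.
• Solution: Let $Y_i$ be all zeros, except for a single n in a random diagonal entry. Then $E[Y_i] = I =: Z$, and $\lambda_{\max}(Y_i Z^{-1}) = n =: u$. Set $k = \Theta(n \log n/\epsilon^2)$. Then, with high probability, every diagonal entry of $\sum_i Y_i/k$ is in $[1-\epsilon, 1+\epsilon]$.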
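A quick simulation illustrates the balls-and-bins claim. This sketch tracks only the diagonal of $\sum_i Y_i/k$, which is n·load/k per bin. The constant c = 8 hidden in the $\Theta(\cdot)$, the function name, and the fixed seed are assumptions chosen for illustration, not values from the talk.

```python
import math
import random

def balls_and_bins(n, eps, c=8.0, seed=0):
    """Throw k = ceil(c * n * ln(n) / eps^2) balls into n bins uniformly.

    Returns (k, loads, diag), where diag[i] = n * loads[i] / k is the i-th
    diagonal entry of sum_i Y_i / k in the matrix formulation.
    """
    rng = random.Random(seed)
    k = math.ceil(c * n * math.log(n) / eps ** 2)
    loads = [0] * n
    for _ in range(k):
        loads[rng.randrange(n)] += 1
    diag = [n * load / k for load in loads]
    return k, loads, diag

k, loads, diag = balls_and_bins(n=10, eps=0.5)
within = all(1 - 0.5 <= d <= 1 + 0.5 for d in diag)
```

With these parameters each bin expects about 74 balls, and a deviation of 50% is more than four standard deviations, so `within` is expected to be true.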
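Solving the General Problem
• General Problem: Given PSD matrices $L_e$ s.t. $\sum_e L_e = L_G$, find coefficients $w_e$, mostly zero, such that $(1-\epsilon) L_G \preceq \sum_e w_e L_e \preceq (1+\epsilon) L_G$
• AW Theorem: Let $Y_1, \dots, Y_k$ be i.i.d. random PSD matrices such that $E[Y_i] = Z$ and $Y_i \preceq uZ$. Then $(1-\epsilon) Z \preceq \tfrac{1}{k} \sum_i Y_i \preceq (1+\epsilon) Z$ except with probability $n \cdot \exp(-\Omega(\epsilon^2 k/u))$.
• Solve General Problem with $O(n \log n/\epsilon^2)$ non-zeros:
– Repeat $k := \Theta(n \log n/\epsilon^2)$ times
– Pick an edge e with probability $p_e := \mathrm{Tr}(L_e L_G^{-1}) / n$
– Increment $w_e$ by $1/(k \cdot p_e)$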
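The sampling procedure above can be sketched for graph Laplacians as follows. This assumes NumPy and a connected graph. Because $L_G$ is singular, the sketch uses the inverse on the space orthogonal to the all-ones vector, where the traces sum to n-1 rather than n; it therefore normalizes by the actual sum, which is a deviation from the slide's "/ n". Function names are illustrative.

```python
import numpy as np
from itertools import combinations

def edge_laplacian(n, s, t, weight):
    """Laplacian L_e of the single weighted edge st."""
    L = np.zeros((n, n))
    L[s, s] = L[t, t] = weight
    L[s, t] = L[t, s] = -weight
    return L

def sampling_probabilities(n, u):
    """Return (edges, traces, p) with traces[e] = Tr(L_e L_G^+), p = traces / sum."""
    edges = list(u)
    L_G = sum(edge_laplacian(n, s, t, u[(s, t)]) for s, t in edges)
    # For connected G, adding ones/n makes L_G invertible; the extra term
    # cancels in the difference M[s,s] + M[t,t] - 2 M[s,t] below.
    M = np.linalg.inv(L_G + np.ones((n, n)) / n)
    traces = np.array([u[(s, t)] * (M[s, s] + M[t, t] - 2 * M[s, t])
                       for s, t in edges])
    return edges, traces, traces / traces.sum()

def spectral_sample(n, u, k, seed=0):
    """Draw k edges with probabilities p_e; each draw adds 1/(k p_e) to w_e."""
    rng = np.random.default_rng(seed)
    edges, _, p = sampling_probabilities(n, u)
    w = {}
    for idx in rng.choice(len(edges), size=k, p=p):
        e = edges[idx]
        w[e] = w.get(e, 0.0) + 1.0 / (k * p[idx])
    return w, dict(zip(edges, p))
```

The sparsifier is then $\sum_e w_e L_e$, so the final weight of edge e in H is $w_e u_e$. On a unit-weight complete graph every trace equals 2/n, so the sampling is uniform, in line with the earlier remark that the complete graph is the easy case.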
Derandomization
• Vector problem: Given vectors $v_e \in [0,1]^n$ s.t. $\sum_e v_e = v$, find coefficients $w_e$, mostly zero, such that $\| \sum_e w_e v_e - v \|_\infty \le \epsilon$
• Theorem [Young ‘94]: The multiplicative weights method deterministically gives w with $O(\log n/\epsilon^2)$ non-zeros
– Or, use pessimistic estimators on the Hoeffding proof
• General Problem: Given PSD matrices $L_e$ s.t. $\sum_e L_e = L_G$, find coefficients $w_e$, mostly zero, such that $(1-\epsilon) L_G \preceq \sum_e w_e L_e \preceq (1+\epsilon) L_G$
• Theorem [de Carli Silva-Harvey-Sato ‘11]: The matrix multiplicative weights method (Arora-Kale ‘07) deterministically gives w with $O(n \log n/\epsilon^2)$ non-zeros
– Or, use matrix pessimistic estimators (Wigderson-Xiao ‘06)
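MWUM for “Balls & Bins”
(Figure: the $\lambda$ values on a number line from 0 to 1, between a lower barrier l and an upper barrier u)
• Let $\lambda_i$ = load in bin i. Initially $\lambda = 0$. Want: $1 \le \lambda_i$ and $\lambda_i \le 1$.
• Introduce penalty functions “$\exp(l - \lambda_i)$” and “$\exp(\lambda_i - u)$”
• Find a bin i to throw a ball into such that, increasing l by $\delta_l$ and u by $\delta_u$, the penalties don’t grow (here $\lambda_i'$ is the load after the throw):
$\sum_i \exp(l + \delta_l - \lambda_i') \le \sum_i \exp(l - \lambda_i)$
$\sum_i \exp(\lambda_i' - (u + \delta_u)) \le \sum_i \exp(\lambda_i - u)$
• Careful analysis shows $O(n \log n/\epsilon^2)$ balls is enough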
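MMWUM for General Problem
(Figure: the $\lambda$ values on a number line from 0 to 1, between a lower barrier l and an upper barrier u)
• Let A=0 and $\lambda$ its eigenvalues. Want: $1 \le \lambda_i$ and $\lambda_i \le 1$.
• Use penalty functions $\mathrm{Tr} \exp(lI - A)$ and $\mathrm{Tr} \exp(A - uI)$
• Find a matrix $L_e$ such that adding $\alpha L_e$ to A, increasing l by $\delta_l$ and u by $\delta_u$, the penalties don’t grow:
$\mathrm{Tr} \exp((l + \delta_l) I - (A + \alpha L_e)) \le \mathrm{Tr} \exp(lI - A)$
$\mathrm{Tr} \exp((A + \alpha L_e) - (u + \delta_u) I) \le \mathrm{Tr} \exp(A - uI)$
• Careful analysis shows $O(n \log n/\epsilon^2)$ matrices is enough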
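Beating Sampling & MMWUM
(Figure: the $\lambda$ values on a number line from 0 to 1, between a lower barrier l and an upper barrier u)
• To get a better bound, try changing the penalty functions to be steeper!
• Use penalty functions $\mathrm{Tr}\,(A - lI)^{-1}$ and $\mathrm{Tr}\,(uI - A)^{-1}$
• Find a matrix $L_e$ such that adding $\alpha L_e$ to A, increasing l by $\delta_l$ and u by $\delta_u$, the penalties don’t grow:
$\mathrm{Tr}\,((A + \alpha L_e) - (l + \delta_l) I)^{-1} \le \mathrm{Tr}\,(A - lI)^{-1}$
$\mathrm{Tr}\,((u + \delta_u) I - (A + \alpha L_e))^{-1} \le \mathrm{Tr}\,(uI - A)^{-1}$
• All eigenvalues stay within [l, u]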
• General Problem: Given PSD matrices $L_e$ s.t. $\sum_e L_e = L_G$, find coefficients $w_e$, mostly zero, such that $(1-\epsilon) L_G \preceq \sum_e w_e L_e \preceq (1+\epsilon) L_G$
• Theorem: [Batson-Spielman-Srivastava ‘09] in rank-1 case, [de Carli Silva-Harvey-Sato ‘11] for general case: This gives a solution w with $O(n/\epsilon^2)$ non-zeros.
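The two steeper penalty functions are easy to evaluate from eigenvalues, since $\mathrm{Tr}\,(A - lI)^{-1} = \sum_i 1/(\lambda_i - l)$ and $\mathrm{Tr}\,(uI - A)^{-1} = \sum_i 1/(u - \lambda_i)$. The sketch below assumes NumPy and a symmetric matrix A. It only tests whether one proposed step keeps both penalties from growing; it does not search for a good step, which is the hard part of the actual algorithm. All names and the 2×2 example are hypothetical.

```python
import numpy as np

def barrier_potentials(A, l, u):
    """Return (Tr (A - lI)^-1, Tr (uI - A)^-1), or None if an eigenvalue
    of A is not strictly between the barriers l and u."""
    ev = np.linalg.eigvalsh(A)
    if not (l < ev[0] and ev[-1] < u):
        return None
    return float(np.sum(1.0 / (ev - l))), float(np.sum(1.0 / (u - ev)))

def step_keeps_penalties(A, L_e, alpha, l, u, dl, du):
    """Check that adding alpha * L_e while shifting the barriers by dl and du
    keeps all eigenvalues inside the barriers and neither penalty grows."""
    before = barrier_potentials(A, l, u)
    after = barrier_potentials(A + alpha * L_e, l + dl, u + du)
    if before is None or after is None:
        return False
    return after[0] <= before[0] and after[1] <= before[1]

# Hypothetical example: A = I with barriers l = 0 and u = 2, and a rank-1 update.
A = np.eye(2)
L_e = np.diag([1.0, 0.0])
ok = step_keeps_penalties(A, L_e, alpha=1.0, l=0.0, u=2.0, dl=0.2, du=1.5)
too_fast = step_keeps_penalties(A, L_e, alpha=1.0, l=0.0, u=2.0, dl=0.5, du=1.5)
```

In this example a lower-barrier shift of 0.2 is acceptable, while a shift of 0.5 makes the lower penalty grow from 2 to about 2.67, so that step is rejected.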
Applications
• Theorem: [de Carli Silva-Harvey-Sato ‘11] Given PSD matrices $L_e$ s.t. $\sum_e L_e = L$, there is an algorithm to find w with $O(n/\epsilon^2)$ non-zeros such that $(1-\epsilon) L \preceq \sum_e w_e L_e \preceq (1+\epsilon) L$
• Application 1: Spectral Sparsifiers with Costs. Given costs on edges of G, can find sparsifier H whose cost is at most $(1+\epsilon)$ times the cost of G.
• Application 2: Simultaneous Spectral Sparsifiers. Given two graphs G1 & G2 with a bijection on their edges, can choose edges that simultaneously sparsify G1 & G2.
• Application 3: Sparse SDP Solutions. $\min \{ c^T y : \sum_i y_i A_i \succeq B,\ y \ge 0 \}$, where the $A_i$’s and B are PSD, has a nearly optimal solution with $O(n/\epsilon^2)$ non-zeros.
Open Questions
• Use of sparsifiers in other areas (infoviz, etc.)
• Sparsifiers for directed graphs
• Construction of expander graphs
• More control of the weights $w_e$
• A combinatorial proof of spectral sparsifiers
• More applications of our general theorem