Sparse Approximations
Nick Harvey, University of British Columbia
Mathematically
• Can an object with many pieces be approximately represented by fewer pieces?
• Independent random sampling usually does well.
• Theme of this talk: When can we beat random sampling?
[Figure: a dense graph and its dense Laplacian matrix, next to a sparse graph and its sparse Laplacian matrix.]
Talk Outline
• Vignette #1: Discrepancy theory
• Vignette #2: Singular values and eigenvalues
• Vignette #3: Graphs
• Theorem on “Spectrally Thin Trees”
Discrepancy
• Given vectors v_1,…,v_n ∈ R^d with ‖v_i‖_p bounded. Want y ∈ {-1,1}^n with ‖∑_i y_i v_i‖_q small.
• Eg1: If ‖v_i‖_∞ ≤ 1 then a uniformly random y gives E‖∑_i y_i v_i‖_∞ ≤ O(√(n log d)).
• Eg2: If ‖v_i‖_∞ ≤ 1 then ∃y s.t. ‖∑_i y_i v_i‖_∞ ≤ O(√n) (for d = O(n)).

Non-algorithmic: Spencer '85 (partial coloring + entropy method), Gluskin '89 (Sidak's lemma), Giannopoulos '97 (partial coloring + Sidak).
Algorithmic: Bansal '10 (Brownian motion + semidefinite program), Bansal-Spencer '11 (Brownian motion + potential function), Lovett-Meka '12 (Brownian motion).
Discrepancy
• Given vectors v_1,…,v_n ∈ R^d with ‖v_i‖_p bounded. Want y ∈ {-1,1}^n with ‖∑_i y_i v_i‖_q small.
• Eg1: If ‖v_i‖_∞ ≤ 1 then a uniformly random y gives E‖∑_i y_i v_i‖_∞ ≤ O(√(n log d)).
• Eg2: If ‖v_i‖_∞ ≤ 1 then ∃y s.t. ‖∑_i y_i v_i‖_∞ ≤ O(√n) (for d = O(n)).
• Eg3: If ‖v_i‖_1 ≤ β, ‖v_i‖_∞ ≤ δ, and ‖∑_i v_i‖_∞ ≤ 1, then ∃y with ‖∑_i y_i v_i‖_∞ ≤ …
  Harvey '13: using the Lovász Local Lemma. Question: Can the log(δ/β²) factor be improved?
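To make Eg1 concrete, here is a minimal numerical sketch (not from the slides; the dimensions n, d and the √(2n ln 2d) benchmark line are illustrative choices):

```python
# Random signing of vectors with ||v_i||_inf <= 1: the infinity-norm discrepancy
# of a uniformly random y is O(sqrt(n log d)), as in Eg1 above.
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 500
V = rng.uniform(-1.0, 1.0, size=(n, d))   # rows v_i with ||v_i||_inf <= 1
y = rng.choice([-1.0, 1.0], size=n)       # uniformly random signs
disc = np.abs(y @ V).max()                # ||sum_i y_i v_i||_inf

print(f"random-signing discrepancy : {disc:.1f}")
print(f"sqrt(2 n ln(2d)) benchmark : {np.sqrt(2 * n * np.log(2 * d)):.1f}")
```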
Talk Outline
• Vignette #1: Discrepancy theory
• Vignette #2: Singular values and eigenvalues
• Vignette #3: Graphs
• Theorem on “Spectrally Thin Trees”
Partitioning sums of rank-1 matrices
• Let v_1,…,v_n ∈ R^d satisfy ∑_i v_i v_i^T = I and ‖v_i‖² ≤ δ. Want y ∈ {-1,1}^n with ‖∑_i y_i v_i v_i^T‖_2 small.
• Random sampling: E‖∑_i y_i v_i v_i^T‖_2 ≤ O(√(δ log d)).
  Rudelson '96: proofs using majorizing measures, then non-commutative Khintchine.
• Marcus-Spielman-Srivastava '13: ∃y ∈ {-1,1}^n with ‖∑_i y_i v_i v_i^T‖_2 ≤ O(√δ).
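A small numerical sketch of the rank-one setting (not from the slides; the construction and sizes are illustrative assumptions): the rows of a matrix with orthonormal columns satisfy ∑_i v_i v_i^T = I, so a random signing can be compared against the √(δ log d) and √δ benchmarks.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 4000, 100
Q, _ = np.linalg.qr(rng.standard_normal((n, d)))   # orthonormal columns => sum_i v_i v_i^T = I_d
delta = (Q ** 2).sum(axis=1).max()                 # delta = max_i ||v_i||^2
y = rng.choice([-1.0, 1.0], size=n)                # uniformly random signs
signed_sum = Q.T @ (y[:, None] * Q)                # sum_i y_i v_i v_i^T
norm = np.linalg.norm(signed_sum, 2)               # spectral norm

print(f"delta                     = {delta:.4f}")
print(f"||sum_i y_i v_i v_i^T||_2 = {norm:.4f}")
print(f"sqrt(delta log d)         = {np.sqrt(delta * np.log(d)):.4f}   (random sampling)")
print(f"sqrt(delta)               = {np.sqrt(delta):.4f}   (MSS '13 target)")
```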
Partitioning sums of matrices
• Given d×d symmetric matrices M_1,…,M_n with ∑_i M_i = I and ‖M_i‖_2 ≤ δ. Want y ∈ {-1,1}^n with ‖∑_i y_i M_i‖_2 small.
• Random sampling: E‖∑_i y_i M_i‖_2 ≤ O(√(δ log d)).
  Also follows from non-commutative Khintchine. Ahlswede-Winter '02: using the matrix moment generating function. Tropp '12: using the matrix cumulant generating function.
Partitioning sums of matrices
• Given d×d symmetric matrices M_1,…,M_n with ∑_i M_i = I and ‖M_i‖_2 ≤ δ. Want y ∈ {-1,1}^n with ‖∑_i y_i M_i‖_2 small.
• Random sampling: E‖∑_i y_i M_i‖_2 ≤ O(√(δ log d)).
• Question: ∃y ∈ {-1,1}^n with ‖∑_i y_i M_i‖_2 ≤ O(√δ)?
• Conjecture: Suppose ∑_i M_i = I and ‖M_i‖_{Schatten-1} ≤ δ. ∃y ∈ {-1,1}^n with ‖∑_i y_i M_i‖_2 ≤ O(√δ)?
  – MSS '13: the rank-one case is true.
  – Harvey '13: the diagonal case is true (ignoring a log(·) factor).
False!
Partitioning sums of matrices
• Given d×d symmetric matrices M_1,…,M_n with ∑_i M_i = I and ‖M_i‖_2 ≤ δ. Want y ∈ {-1,1}^n with ‖∑_i y_i M_i‖_2 small.
• Random sampling: E‖∑_i y_i M_i‖_2 ≤ O(√(δ log d)).
• Question: Suppose only that ‖M_i‖_2 ≤ 1. ∃y ∈ {-1,1}^n with ‖∑_i y_i M_i‖_2 ≤ O(√n)?
  – Spencer/Gluskin: the diagonal case is true.
Column-subset selection
• Given vectors v_1,…,v_n ∈ R^d with ‖v_i‖_2 = 1. Let st.rank = n/‖∑_i v_i v_i^T‖_2 and let k ≈ ε²·st.rank.
  ∃y ∈ {0,1}^n s.t. ∑_i y_i = k and (1-ε)² ≤ λ_k(∑_i y_i v_i v_i^T).
  Spielman-Srivastava '09: potential function argument.
• Youssef '12: for k ≈ ε²·st.rank, ∃y ∈ {0,1}^n s.t. ∑_i y_i = k, (1-ε)² ≤ λ_k(∑_i y_i v_i v_i^T), and λ_1(∑_i y_i v_i v_i^T) ≤ (1+ε)².
Column-subset selection, up to the stable rank
• Given vectors v_1,…,v_n ∈ R^d with ‖v_i‖_2 = 1. Let st.rank = n/‖∑_i v_i v_i^T‖_2 and let k ≈ st.rank. For y ∈ {0,1}^n s.t. ∑_i y_i = k, can we control λ_k(∑_i y_i v_i v_i^T) and λ_1(∑_i y_i v_i v_i^T)?
  – λ_k can be very small, say O(1/d).
  – Rudelson's theorem: can get λ_1 ≤ O(log d) and λ_k > 0.
  – Harvey-Olver '13: λ_1 ≤ O(log d / log log d) and λ_k > 0.
  – MSS '13: if ∑_i v_i v_i^T = I, can get λ_1 ≤ O(1) and λ_k > 0.
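As a baseline for the results above, here is a small experiment (not from the slides; the vectors and sizes are illustrative) that picks k ≈ st.rank columns uniformly at random and reports λ_k and λ_1, the quantities the theorems above control much better than naive random selection:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 1000, 50
V = rng.standard_normal((n, d))
V /= np.linalg.norm(V, axis=1, keepdims=True)      # unit vectors v_i (rows)
stable_rank = n / np.linalg.norm(V.T @ V, 2)       # n / ||sum_i v_i v_i^T||_2
k = int(stable_rank)

S = rng.choice(n, size=k, replace=False)           # a uniformly random k-subset
M = V[S].T @ V[S]                                  # sum_{i in S} v_i v_i^T
eigs = np.linalg.eigvalsh(M)                       # eigenvalues, ascending
print(f"stable rank ~ {stable_rank:.1f}, k = {k}")
print(f"lambda_k = {eigs[d - k]:.4f}, lambda_1 = {eigs[-1]:.4f}")
```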
Talk Outline
• Vignette #1: Discrepancy theory
• Vignette #2: Singular values and eigenvalues
• Vignette #3: Graphs
• Theorem on “Spectrally Thin Trees”
Graph Laplacian
Graph with weights u on vertices {a, b, c, d}: u(ab) = 2, u(ac) = 5, u(bc) = 1, u(cd) = 10.

Laplacian matrix (rows and columns ordered a, b, c, d):

L_u = D - A =
⎡  7  -2  -5   0 ⎤
⎢ -2   3  -1   0 ⎥
⎢ -5  -1  16 -10 ⎥
⎣  0   0 -10  10 ⎦

The diagonal entry 16 is the weighted degree of node c; the off-diagonal entry -5 is the negative of u(ac).
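A minimal sketch (not from the slides) of building L_u = D - A for the example graph above, with edge weights u(ab)=2, u(ac)=5, u(bc)=1, u(cd)=10:

```python
import numpy as np

def laplacian(n, weighted_edges):
    """L = D - A for an undirected graph on vertices 0..n-1."""
    L = np.zeros((n, n))
    for i, j, w in weighted_edges:
        L[i, i] += w               # weighted degrees on the diagonal
        L[j, j] += w
        L[i, j] -= w               # minus the edge weight off the diagonal
        L[j, i] -= w
    return L

# vertices a, b, c, d -> 0, 1, 2, 3
edges = [(0, 1, 2.0), (0, 2, 5.0), (1, 2, 1.0), (2, 3, 10.0)]
L_u = laplacian(4, edges)
print(L_u)                          # matches the 4x4 matrix shown above
```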
Effective Resistance from s to t: the voltage difference when each edge e is a (1/u_e)-ohm resistor and a 1-amp current source is placed between s and t:
R_eff(s,t) = (e_s - e_t)^T L_u^+ (e_s - e_t).
Effective Conductance: c_st = 1 / (effective resistance from s to t).
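Effective resistance can be computed directly from the formula above via the Laplacian pseudoinverse; a self-contained sketch (not from the slides) on the 4-vertex example graph:

```python
import numpy as np

# Laplacian of the example graph (vertices ordered a, b, c, d)
L_u = np.array([[  7., -2.,  -5.,   0.],
                [ -2.,  3.,  -1.,   0.],
                [ -5., -1.,  16., -10.],
                [  0.,  0., -10.,  10.]])

def effective_resistance(L, s, t):
    """R_eff(s,t) = (e_s - e_t)^T L^+ (e_s - e_t)."""
    e = np.zeros(L.shape[0])
    e[s], e[t] = 1.0, -1.0
    return float(e @ np.linalg.pinv(L) @ e)

R_ac = effective_resistance(L_u, 0, 2)                 # s = a, t = c
print(f"R_eff(a,c) = {R_ac:.4f}, c_ac = {1.0 / R_ac:.4f}")
```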
Spectral approximation of graphs
α-spectral sparsifier: L_u ≼ L_w ≼ α·L_u

[Figure: the Laplacian L_u of a dense graph with edge weights u, next to the Laplacian L_w of a sparse reweighted subgraph with edge weights w.]
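A sketch (not from the slides) of how one might test the α-spectral-sparsifier condition numerically: compute the eigenvalues of L_u^{+/2} L_w L_u^{+/2} on the complement of the all-ones kernel; L_u ≼ L_w ≼ α·L_u exactly when they all lie in [1, α]. The complete-graph/weighted-star example is an illustrative assumption.

```python
import numpy as np

def relative_eigenvalues(L_u, L_w, tol=1e-9):
    """Eigenvalues of L_u^{+/2} L_w L_u^{+/2}, ignoring the all-ones kernel."""
    vals, vecs = np.linalg.eigh(L_u)
    keep = vals > tol
    P = vecs[:, keep] / np.sqrt(vals[keep])    # columns of L_u^{+/2}
    return np.linalg.eigvalsh(P.T @ L_w @ P)

n = 6
L_u = n * np.eye(n) - np.ones((n, n))          # complete graph K_6
L_w = np.zeros((n, n))                         # star centered at vertex 0, each edge weight n
for j in range(1, n):
    L_w[0, 0] += n; L_w[j, j] += n
    L_w[0, j] -= n; L_w[j, 0] -= n

evs = relative_eigenvalues(L_u, L_w)
print(evs.min(), evs.max())    # ~1 and ~n: the weighted star is an (alpha = n)-approximation of K_n
```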
Ramanujan Graphs
• Suppose L_u is the complete graph on n vertices (u_e = 1 for all e).
• Lubotzky-Phillips-Sarnak '86: for infinitely many d and n, ∃w ∈ {0,1}^E such that ∑_e w_e = dn/2 (in fact L_w is d-regular) and all nontrivial eigenvalues of L_w lie within 2√(d-1) of d, so a suitable scaling of L_w spectrally approximates L_u.
• MSS '13: holds for all d ≥ 3 and all n = c·2^k.
• Friedman '04: if L_w is a random d-regular graph, then for all ε > 0 its nontrivial eigenvalues lie within 2√(d-1) + ε of d, with high probability.
Arbitrary graphs
• Spielman-Srivastava '08: for any graph L_u with n = |V|, ∃w ∈ R^E such that |support(w)| = O(n log(n)/ε²) and (1-ε)·L_u ≼ L_w ≼ (1+ε)·L_u.
  Proof: follows from Rudelson's theorem.
• MSS '13: for any graph L_u with n = |V|, ∃w ∈ R^E such that w_e ∈ Θ(ε²)·ℕ·(effective conductance of e), |support(w)| = O(n/ε²), and (1-ε)·L_u ≼ L_w ≼ (1+ε)·L_u.
Spectrally-thin trees
• Question: Let G be an unweighted graph with n vertices. Let C = min_e (effective conductance of edge e). Want a subtree T of G with L_T ≼ (α/C)·L_G.
• Equivalent to …
• Goddyn's Conjecture '85: there is a subtree T satisfying the analogous combinatorial (cut-counting) notion of thinness.
  – Relates to conjectures of Tutte ('54) on nowhere-zero flows, and to approximations of the traveling salesman problem.
Spectrally-thin trees
• Question: Let G be an unweighted graph with n vertices. Let C = min_e (effective conductance of edge e). Want a subtree T of G with L_T ≼ (α/C)·L_G.
• Rudelson's theorem: easily gives α = O(log n).
• Harvey-Olver '13: α = O(log n / log log n). Moreover, there is an efficient algorithm to find such a tree.
• MSS '13: α = O(1), but not algorithmic.
Talk Outline
• Vignette #1: Discrepancy theory
• Vignette #2: Singular values and eigenvalues
• Vignette #3: Graphs
• Theorem on “Spectrally Thin Trees”
Spectrally Thin Trees
Theorem: given an (unweighted) graph G with effective conductances ≥ C, we can find an unweighted tree T with L_T ≼ O(log n / log log n)/C · L_G.

Proof overview:
1. Show independent sampling gives spectral thinness, but not a tree.
   ► Sample every edge e independently with probability x_e = 1/c_e.
2. Show dependent sampling gives a tree, and spectral thinness still works.
Matrix Concentration
Theorem [Tropp '12]: Let Y_1,…,Y_m be independent PSD matrices of size n×n. Let Y = ∑_i Y_i and Z = E[Y]. Suppose Y_i ≼ R·Z a.s. Then Pr[ Y ⋠ α·Z ] ≤ n·(e^{α-1}/α^α)^{1/R}.

Independent sampling
Define sampling probabilities x_e = 1/c_e. It is known that ∑_e x_e = n-1.
Claim: independent sampling gives T ⊆ E with E[|T|] = n-1 and ∑_{e∈T} c_e L_e ≼ α·L_G (hence L_T ≼ (α/C)·L_G) whp.

Theorem [Tropp '12]: Let M_1,…,M_m be n×n PSD matrices. Let D(x) be a product distribution on {0,1}^m with marginals x. Let Z = E_{X∼D(x)}[∑_i X_i M_i] and suppose M_i ≼ Z. Then Pr_{X∼D(x)}[ ∑_i X_i M_i ⋠ α·Z ] ≤ n·(e^{α-1}/α^α).

Define M_e = c_e·L_e, where L_e is the Laplacian of the single edge e. Then Z = L_G and M_e ≼ Z holds; these are the only properties of the conductances used.
Setting α = 6 log n / log log n, we get ∑_{e∈T} c_e L_e ≼ α·L_G whp.
But T is not a tree!
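A numerical sketch of step 1 (assumptions beyond the slides: a concrete small graph and a plain numpy implementation). Each edge is kept independently with probability x_e = 1/c_e; Foster's theorem gives ∑_e x_e = n-1, and we report the spectral thinness of the sampled edge set, which is usually far from being a tree.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 40
edges = [(i, j) for i in range(n) for j in range(i + 1, n)]   # complete graph K_n

L = np.zeros((n, n))
for i, j in edges:
    L[i, i] += 1; L[j, j] += 1; L[i, j] -= 1; L[j, i] -= 1

Lpinv = np.linalg.pinv(L)
def conductance(i, j):
    e = np.zeros(n); e[i], e[j] = 1.0, -1.0
    return 1.0 / float(e @ Lpinv @ e)                         # c_e = 1 / R_eff(i,j)

c = np.array([conductance(i, j) for i, j in edges])
x = 1.0 / c                                                   # sampling probabilities
print(f"sum_e x_e = {x.sum():.2f}  (Foster: n - 1 = {n - 1})")

keep = rng.random(len(edges)) < x                             # independent sampling
M = np.zeros((n, n))                                          # sum_{e in T} c_e L_e
for (i, j), ce, kept in zip(edges, c, keep):
    if kept:
        M[i, i] += ce; M[j, j] += ce; M[i, j] -= ce; M[j, i] -= ce

vals, vecs = np.linalg.eigh(L)
P = vecs[:, vals > 1e-9] / np.sqrt(vals[vals > 1e-9])
thinness = np.linalg.eigvalsh(P.T @ M @ P).max()              # alpha s.t. sum c_e L_e <= alpha L_G
print(f"|T| = {keep.sum()} edges (a spanning tree has exactly {n - 1})")
print(f"spectral thinness ~ {thinness:.2f}, vs 6 log n / log log n = "
      f"{6 * np.log(n) / np.log(np.log(n)):.2f}")
```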
Spectrally Thin Trees
Theorem: given an (unweighted) graph G with effective conductances ≥ C, we can find an unweighted tree T with L_T ≼ O(log n / log log n)/C · L_G.

Proof overview:
1. Show independent sampling gives spectral thinness, but not a tree.
   ► Sample every edge e independently with probability x_e = 1/c_e.
2. Show dependent sampling gives a tree, and spectral thinness still works.
   ► Run pipage rounding to get a tree T with Pr[e ∈ T] = x_e = 1/c_e.
Pipage rounding [Ageev-Sviridenko '04, Srinivasan '01, Calinescu et al. '07, Chekuri et al. '09]
Let P be any matroid polytope, e.g., the convex hull of characteristic vectors of spanning trees.
• Given a fractional x, find coordinates a and b s.t. the line z ↦ x + z(e_a - e_b) stays in the current face.
• Find the two points where this line leaves P.
• Randomly choose one of those points s.t. the expectation is x.
• Repeat until x = χ_T is integral.
x is a martingale: the expectation of the final χ_T equals the original fractional x.

[Figure: the polytope P, the fractional point x, and possible integral endpoints χ_T1, …, χ_T6.]
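A minimal sketch of the pipage-rounding step (an illustrative assumption: it rounds over the simple cardinality polytope {x : ∑_i x_i = k, 0 ≤ x ≤ 1} rather than the spanning-tree polytope used in the talk, where identifying the face to move along is more involved). It shows the random swap step and the martingale property Pr[e ∈ T] = x_e.

```python
import numpy as np

def pipage_round(x, rng):
    """Round fractional x (with integral coordinate sum) to a 0/1 vector,
    keeping E[final vector] = x."""
    x = x.astype(float).copy()
    while True:
        frac = np.where((x > 1e-12) & (x < 1 - 1e-12))[0]
        if len(frac) == 0:
            return np.round(x)
        assert len(frac) >= 2, "coordinate sum must be integral"
        a, b = frac[0], frac[1]
        up   = min(1 - x[a], x[b])          # largest step in the +(e_a - e_b) direction
        down = min(x[a], 1 - x[b])          # largest step in the -(e_a - e_b) direction
        # choose an endpoint with probabilities that keep the expectation at x
        z = up if rng.random() < down / (up + down) else -down
        x[a] += z
        x[b] -= z

rng = np.random.default_rng(4)
x0 = np.array([0.5, 0.25, 0.75, 0.5, 1.0, 0.0])      # fractional point, sums to 3
print(pipage_round(x0, rng))                          # a 0/1 vector with exactly three 1s

# empirical check of the martingale property: the mean of many roundings ~ x0
samples = np.stack([pipage_round(x0, rng) for _ in range(20000)])
print(samples.mean(axis=0))
```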
Pipage rounding and concavity
Say f : R^m → R is concave under swaps if z ↦ f(x + z(e_a - e_b)) is concave for all x ∈ P and all a, b ∈ [m].
Let X_0 be the initial point and χ_T the final point visited by pipage rounding.
Claim: if f is concave under swaps then E[f(χ_T)] ≤ f(X_0). [Jensen]
Let E ⊆ {0,1}^m be an event. Let g : [0,1]^m → R be a pessimistic estimator for E, i.e., Pr_{X∼D(x)}[X ∈ E] ≤ g(x), where D(x) is the product distribution with marginals x.
Claim: suppose g is concave under swaps. Then Pr[χ_T ∈ E] ≤ g(X_0).
Chernoff Bound
Chernoff Bound: Fix any w, x ∈ [0,1]^m and let μ = w^T x. Define g_{t,θ}(x) = e^{-tθ}·∏_i (1 + (e^{t·w_i} - 1)·x_i). Then Pr_{X∼D(x)}[w^T X ≥ θ] ≤ g_{t,θ}(x) for every t ≥ 0, and optimizing over t recovers the usual Chernoff tail bound in terms of μ.
Claim: g_{t,θ} is concave under swaps. [Elementary calculus]
Let X_0 be the initial point and χ_T the final point visited by pipage rounding. Let μ = w^T X_0. Then Pr[w^T χ_T ≥ θ] ≤ g_{t,θ}(X_0).
⇒ The bound achieved by independent sampling is also achieved by pipage rounding.
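A numerical sketch (not from the slides) of the estimator above: it checks by Monte Carlo that g_{t,θ} upper-bounds the tail probability, and evaluates g_{t,θ} along a swap direction x + z(e_a - e_b), where it is a quadratic in z with nonpositive leading coefficient, hence concave.

```python
import numpy as np

def g(x, w, t, theta):
    """Chernoff pessimistic estimator: e^{-t theta} * prod_i (1 + (e^{t w_i} - 1) x_i)."""
    return np.exp(-t * theta) * np.prod(1.0 + (np.exp(t * w) - 1.0) * x)

rng = np.random.default_rng(5)
m = 200
w = rng.uniform(0, 1, m)
x = rng.uniform(0, 1, m)
mu = w @ x
t, theta = 0.5, 1.2 * mu

# Monte Carlo check that g upper-bounds Pr[w^T X >= theta] under the product distribution D(x)
X = (rng.random((20000, m)) < x).astype(float)
tail = np.mean(X @ w >= theta)
print(f"Pr[w^T X >= theta] ~ {tail:.4f}  <=  g_t,theta(x) = {g(x, w, t, theta):.4f}")

# Along x + z (e_a - e_b), g is a quadratic in z with leading coefficient
# -(e^{t w_a} - 1)(e^{t w_b} - 1) <= 0, so it is concave under swaps.
a, b = 0, 1
zs = np.linspace(-min(x[a], 1 - x[b]), min(1 - x[a], x[b]), 5)
vals = []
for z in zs:
    xz = x.copy(); xz[a] += z; xz[b] -= z
    vals.append(g(xz, w, t, theta))
print(np.round(vals, 4))    # values along the swap direction (concave profile)
```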
Matrix Pessimistic Estimators
Theorem [Tropp '12]: Let M_1,…,M_m be n×n PSD matrices. Let D(x) be a product distribution on {0,1}^m with marginals x. Let Z = E_{X∼D(x)}[∑_i X_i M_i] and suppose M_i ≼ Z. Let g_{t,θ}(x) = … (a matrix pessimistic estimator, built from tr exp and the matrix cumulant generating function). Then Pr_{X∼D(x)}[ ∑_i X_i M_i ⋠ θ·Z ] ≤ g_{t,θ}(x) and g_{t,θ}(x) ≤ ….
Main Theorem: g_{t,θ} is concave under swaps.
⇒ The bound achieved by independent sampling is also achieved by pipage rounding.
Spectrally Thin Trees
Theorem: given an (unweighted) graph G with effective conductances ≥ C, we can find an unweighted tree T with L_T ≼ O(log n / log log n)/C · L_G.

Proof overview:
1. Show independent sampling gives spectral thinness, but not a tree.
   ► Sample every edge e independently with probability x_e = 1/c_e.
2. Show dependent sampling gives a tree, and spectral thinness still works.
   ► Run pipage rounding to get a tree T with Pr[e ∈ T] = x_e = 1/c_e.
Matrix Analysis
Matrix concentration inequalities are usually proven via sophisticated inequalities in matrix analysis:
• Rudelson: the non-commutative Khintchine inequality.
• Ahlswede-Winter: the Golden-Thompson inequality — if A, B are symmetric, then tr(e^{A+B}) ≤ tr(e^A e^B).
• Tropp: Lieb's concavity inequality [1973] — if A, B are Hermitian and C is PD, then z ↦ tr exp(A + log(C + zB)) is concave.

Key technical result: a new variant of Lieb's theorem — if A is Hermitian, B_1, B_2 are PSD, and C_1, C_2 are PD, then z ↦ tr exp(A + log(C_1 + zB_1) + log(C_2 - zB_2)) is concave.
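The concavity statements above can at least be sanity-checked numerically. A sketch (not from the slides; random matrices, not a proof) that evaluates f(z) = tr exp(A + log(C + zB)) on a grid and verifies midpoint concavity:

```python
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(6)
d = 6
A = rng.standard_normal((d, d)); A = (A + A.T) / 2             # Hermitian
B = rng.standard_normal((d, d)); B = B @ B.T                   # PSD
C = rng.standard_normal((d, d)); C = C @ C.T + np.eye(d)       # PD

def f(z):
    return np.trace(expm(A + logm(C + z * B))).real

zs = np.linspace(0.0, 2.0, 9)                                  # equally spaced grid
vals = np.array([f(z) for z in zs])
# midpoint concavity: f(z_{i+1}) >= (f(z_i) + f(z_{i+2})) / 2 on an equispaced grid
ok = all(vals[i + 1] >= (vals[i] + vals[i + 2]) / 2 - 1e-8 for i in range(len(vals) - 2))
print("concave on this sample:", ok)
```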