Matrix Martingales in Randomized Numerical Linear Algebra
Transcript of "Matrix Martingales in Randomized Numerical Linear Algebra"
Speaker: Rasmus Kyng, Harvard University
Sep 27, 2018
Concentration of Scalar Random Variables
Random X = ∑_i X_i,  X_i ∈ ℝ
1. X_i are independent
2. E[X_i] = 0
Is ! ≈ 0 with high probability?
Concentration of Scalar Random Variables
Bernstein's Inequality
Random X = ∑_i X_i,  X_i ∈ ℝ
1. X_i are independent
2. E[X_i] = 0
3. |X_i| ≤ R
4. ∑_i E[X_i²] ≤ σ²
gives
ℙ[ |X| > t ] ≤ 2 exp( -(t²/2) / (Rt + σ²) ) ≤ δ
E.g. if t = 0.5 and R, σ² = 0.1/log(1/δ)
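As a small illustration (mine, not part of the talk), a Monte Carlo check that the Bernstein bound dominates the empirical tail probability; the parameters n, R, and the trial count are arbitrary choices:

```python
# Illustrative Monte Carlo check that the Bernstein tail bound dominates the
# empirical tail of a sum of bounded, zero-mean, independent variables.
import math
import random

random.seed(0)

n, R = 1000, 0.01                 # n summands, each bounded by R
sigma2 = n * R**2 / 3             # X_i uniform on [-R, R] has E[X_i^2] = R^2/3
t = 0.5

bernstein_bound = 2 * math.exp(-(t**2 / 2) / (R * t + sigma2))

trials = 500
exceed = sum(
    1 for _ in range(trials)
    if abs(sum(random.uniform(-R, R) for _ in range(n))) > t
)
empirical = exceed / trials
print(f"Bernstein bound = {bernstein_bound:.4f}, empirical tail = {empirical:.4f}")
```

The bound is loose (it must hold for every distribution satisfying the assumptions), so the empirical tail sits well below it.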
Concentration of Scalar Martingales
Random X = ∑_i X_i,  X_i ∈ ℝ
1. X_i are independent — no longer required
2. E[X_i] = 0 becomes E[X_i | X_1, …, X_{i-1}] = 0, i.e. E[X_i | previous steps] = 0
Concentration of Scalar Martingales
Freedman's Inequality vs. Bernstein's Inequality
Random X = ∑_i X_i,  X_i ∈ ℝ
1. X_i are independent — no longer required
2. E[X_i] = 0 becomes E[X_i | previous steps] = 0
3. |X_i| ≤ R
4. ∑_i E[X_i²] ≤ σ² becomes ℙ[ ∑_i E[X_i² | previous steps] > σ² ] ≤ ε
gives
ℙ[ |X| > t ] ≤ 2 exp( -(t²/2) / (Rt + σ²) ) + ε
Concentration of Scalar Martingales
Freedman's Inequality
Random X = ∑_i X_i,  X_i ∈ ℝ
1. X_i need not be independent
2. E[X_i | previous steps] = 0
3. |X_i| ≤ R
4. ℙ[ ∑_i E[X_i² | previous steps] > σ² ] ≤ ε
gives
ℙ[ |X| > t ] ≤ 2 exp( -(t²/2) / (Rt + σ²) ) + ε
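Freedman's setting can be simulated too. The following sketch is my own example (not from the talk): each step is ±c where the magnitude c depends on the previous steps, so the X_i are not independent, yet E[X_i | previous steps] = 0.

```python
# A martingale with history-dependent step sizes: not independent increments,
# but conditionally zero-mean, so Freedman's inequality still applies.
import math
import random

random.seed(1)

def run_martingale(n, R):
    """Return the final sum X and its predictable quadratic variation."""
    X, pqv = 0.0, 0.0
    for _ in range(n):
        c = min(R, 0.005 * (1 + abs(X)))   # step size depends on history
        pqv += c * c                       # E[X_i^2 | past] = c^2
        X += c if random.random() < 0.5 else -c
    return X, pqv

n, R, t, trials = 1000, 0.01, 0.5, 500
exceed, sigma2 = 0, 0.0
for _ in range(trials):
    X, pqv = run_martingale(n, R)
    sigma2 = max(sigma2, pqv)              # empirical proxy for the PQV bound
    if abs(X) > t:
        exceed += 1

# Freedman bound, taking eps = 0 since sigma2 covered every observed trial.
freedman_bound = 2 * math.exp(-(t**2 / 2) / (R * t + sigma2))
print(f"max PQV = {sigma2:.4f}, bound = {freedman_bound:.4f}, "
      f"empirical = {exceed / trials:.4f}")
```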
Concentration of Matrix Random Variables
Matrix Bernstein's Inequality (Tropp '11)
Random X = ∑_i X_i,  X_i ∈ ℝ^{n×n}, symmetric
1. X_i are independent
2. E[X_i] = 0
3. ‖X_i‖ ≤ R
4. ‖ ∑_i E[X_i²] ‖ ≤ σ²
gives
ℙ[ ‖X‖ > t ] ≤ n · 2 exp( -(t²/2) / (Rt + σ²) )
Concentration of Matrix Martingales
Matrix Freedman's Inequality (Tropp '11)
Random X = ∑_i X_i,  X_i ∈ ℝ^{n×n}, symmetric
1. X_i need not be independent
2. E[X_i | previous steps] = 0
3. ‖X_i‖ ≤ R
4. ℙ[ ‖ ∑_i E[X_i² | prev. steps] ‖ > σ² ] ≤ ε
gives
ℙ[ ‖X‖ > t ] ≤ n · 2 exp( -(t²/2) / (Rt + σ²) ) + ε
E.g. if t = 0.5 and R, σ² = 0.1/log(n/δ), this is ≤ δ + ε
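To make the matrix setting concrete, here is a toy experiment of my own (not from the talk): sum random sign flips of two fixed symmetric 2×2 matrices and measure the spectral norm, which has a closed form in the 2×2 case.

```python
# Sum of random sign flips of fixed symmetric 2x2 matrices; the spectral norm
# of the sum stays near sqrt(n)*R, far below the worst case n*R.
import math
import random

random.seed(2)

def spec_norm_2x2(a, b, d):
    """Spectral norm of the symmetric 2x2 matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    rad = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return max(abs(mean + rad), abs(mean - rad))

# X_i = eps_i * A_i with eps_i = +-1, A_i alternating between two fixed
# symmetric matrices, so E[X_i] = 0 and ||X_i|| <= R.
A1 = (0.01, 0.0, 0.0)    # [[0.01, 0], [0, 0]]
A2 = (0.0, 0.01, 0.0)    # [[0, 0.01], [0.01, 0]], norm 0.01
R = 0.01
n = 1000

a = b = d = 0.0
for i in range(n):
    s = 1 if random.random() < 0.5 else -1
    m = A1 if i % 2 == 0 else A2
    a += s * m[0]; b += s * m[1]; d += s * m[2]

norm = spec_norm_2x2(a, b, d)
print(f"||sum of {n} matrices|| = {norm:.3f}  (sqrt(n)*R = {math.sqrt(n)*R:.3f})")
```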
Concentration of Matrix Martingales
Matrix Freedman's Inequality (Tropp '11)
ℙ[ ‖ ∑_i E[X_i² | prev. steps] ‖ > σ² ] ≤ ε
Predictable quadratic variation: ∑_i E[X_i² | prev. steps]
Laplacian Matrices
Graph G = (V, E), edge weights w: E → ℝ₊
n = |V|, m = |E|
n×n matrix L = ∑_{e∈E} L_e
Laplacian Matrices
Graph G = (V, E), edge weights w: E → ℝ₊
n = |V|, m = |E|
n×n matrix L = ∑_{e∈E} L_e
For an edge e = (a, b), the edge Laplacian L_{(a,b)} is the n×n matrix that is zero everywhere except:
L_{(a,b)}[a,a] = L_{(a,b)}[b,b] = w(a,b)
L_{(a,b)}[a,b] = L_{(a,b)}[b,a] = -w(a,b)
Laplacian Matrices
A: weighted adjacency matrix of the graph
D: diagonal matrix of weighted degrees
L = D - A
A_{uv} = w_{uv}
D_{uu} = ∑_v w_{uv}
Graph G = (V, E), edge weights w: E → ℝ₊, n = |V|, m = |E|
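The definitions above translate directly to code. A minimal sketch of my own, using plain Python lists for the matrices:

```python
# Build A (weighted adjacency), D (weighted degrees), and L = D - A
# from an edge list.
def laplacian(n, edges):
    """edges: list of (u, v, w) with 0-based vertices and weight w > 0."""
    A = [[0.0] * n for _ in range(n)]
    for u, v, w in edges:
        A[u][v] += w
        A[v][u] += w
    D = [sum(row) for row in A]                      # D_uu = sum_v w_uv
    return [[(D[u] if u == v else 0.0) - A[u][v] for v in range(n)]
            for u in range(n)]

# Triangle with edge weights 1, 1, 2 — the example graph from the slides.
L = laplacian(3, [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0)])
for row in L:
    print(row)
```

Each row sums to zero and the off-diagonals are non-positive, matching the characterization on the next slide.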
Laplacian Matrices
Symmetric matrix L
All off-diagonals are non-positive and
L_{uu} = ∑_{v≠u} |L_{uv}|
Laplacian of a Graph
[ 1 -1  0 ]   [ 0  0  0 ]   [ 2  0 -2 ]   [ 3 -1 -2 ]
[-1  1  0 ] + [ 0  1 -1 ] + [ 0  0  0 ] = [-1  2 -1 ]
[ 0  0  0 ]   [ 0 -1  1 ]   [-2  0  2 ]   [-2 -1  3 ]

(a triangle on three vertices, with edge weights 1, 1, and 2)
Laplacian Matrices
[ST04]: solving Laplacian linear equations in Õ(m) time
⋮
[KS16]: simple algorithm
Solving a Laplacian Linear Equation
Lx = b
Gaussian Elimination
Find U, upper triangular matrix, s.t.
U^T U = L
Then
x = U^{-1} U^{-T} b
Easy to apply U^{-1} and U^{-T}
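Why applying U^{-1} and U^{-T} is easy: with a triangular factor, solving Lx = b is just one forward and one backward substitution. A short sketch of my own (the factor U is the one derived later in the additive-view example):

```python
# Solve Lx = b via x = U^{-1} (U^{-T} b) using two triangular solves.
def solve_lower(Lt, b):
    """Forward substitution with a lower-triangular matrix (here U^T)."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(Lt[i][j] * y[j] for j in range(i))) / Lt[i][i]
    return y

def solve_upper(U, y):
    """Back substitution with an upper-triangular matrix."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

U = [[4, -1, -2, -1],
     [0,  2, -1, -1],
     [0,  0,  3, -1],
     [0,  0,  0,  2]]
Ut = [[U[j][i] for j in range(4)] for i in range(4)]   # U^T
b = [1.0, 2.0, 3.0, 4.0]
x = solve_upper(U, solve_lower(Ut, b))
print(x)
```

Each triangular solve touches every non-zero of U once, which is why a sparse U gives a fast solver.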
Solving a Laplacian Linear Equation
Lx = b
Approximate Gaussian Elimination
Find U, upper triangular matrix, s.t.
U^T U ≈ L
U is sparse.
O(log 1/ε) iterations to get an ε-approximate solution x̃.
Approximate Gaussian Elimination
Theorem [KS]
When L is a Laplacian matrix with m non-zeros, we can find in O(m log³ n) time an upper triangular matrix U with O(m log³ n) non-zeros, s.t. w.h.p.
U^T U ≈ L
Additive View of Gaussian Elimination
Find U, upper triangular matrix, s.t. U^T U = L

L = [ 16 -4 -8 -4 ]
    [ -4  5  0 -1 ]
    [ -8  0 14  0 ]
    [ -4 -1  0  7 ]

Find the rank-1 matrix that agrees with L on the first row and column:

[  4 ]                  [ 16 -4 -8 -4 ]
[ -1 ] [ 4 -1 -2 -1 ] = [ -4  1  2  1 ]
[ -2 ]                  [ -8  2  4  2 ]
[ -1 ]                  [ -4  1  2  1 ]
Additive View of Gaussian Elimination
[ 16 -4 -8 -4 ]   [ 16 -4 -8 -4 ]   [ 0  0  0  0 ]
[ -4  5  0 -1 ] - [ -4  1  2  1 ] = [ 0  4 -2 -2 ]
[ -8  0 14  0 ]   [ -8  2  4  2 ]   [ 0 -2 10 -2 ]
[ -4 -1  0  7 ]   [ -4  1  2  1 ]   [ 0 -2 -2  6 ]

Subtract the rank-1 matrix. We have eliminated the first variable.
Additive View of Gaussian Elimination
The remaining matrix is PSD.

[ 0  0  0  0 ]
[ 0  4 -2 -2 ]
[ 0 -2 10 -2 ]
[ 0 -2 -2  6 ]
Additive View of Gaussian Elimination
[ 0  0  0  0 ]
[ 0  4 -2 -2 ]
[ 0 -2 10 -2 ]
[ 0 -2 -2  6 ]

Find the rank-1 matrix that agrees with our matrix on the next row and column:

[  0 ]                  [ 0  0  0  0 ]
[  2 ] [ 0  2 -1 -1 ] = [ 0  4 -2 -2 ]
[ -1 ]                  [ 0 -2  1  1 ]
[ -1 ]                  [ 0 -2  1  1 ]
Additive View of Gaussian Elimination
Subtract the rank-1 matrix. We have eliminated the second variable.

[ 0  0  0  0 ]   [ 0  0  0  0 ]   [ 0  0  0  0 ]
[ 0  4 -2 -2 ] - [ 0  4 -2 -2 ] = [ 0  0  0  0 ]
[ 0 -2 10 -2 ]   [ 0 -2  1  1 ]   [ 0  0  9 -3 ]
[ 0 -2 -2  6 ]   [ 0 -2  1  1 ]   [ 0  0 -3  5 ]
Additive View of Gaussian Elimination
L = [ 16 -4 -8 -4 ]
    [ -4  5  0 -1 ]
    [ -8  0 14  0 ]
    [ -4 -1  0  7 ]

Repeat until all parts are written as rank-1 terms:

L = v₁v₁^T + v₂v₂^T + v₃v₃^T + v₄v₄^T, where
v₁ = (4, -1, -2, -1)^T, v₂ = (0, 2, -1, -1)^T, v₃ = (0, 0, 3, -1)^T, v₄ = (0, 0, 0, 2)^T
Additive View of Gaussian Elimination
L = [ 16 -4 -8 -4 ]   [  4  0  0  0 ] [ 4 -1 -2 -1 ]
    [ -4  5  0 -1 ] = [ -1  2  0  0 ] [ 0  2 -1 -1 ]
    [ -8  0 14  0 ]   [ -2 -1  3  0 ] [ 0  0  3 -1 ]
    [ -4 -1  0  7 ]   [ -1 -1 -1  2 ] [ 0  0  0  2 ]
Additive View of Gaussian Elimination
Repeat until all parts are written as rank-1 terms:

L = [ 16 -4 -8 -4 ]   [ 4 -1 -2 -1 ]^T [ 4 -1 -2 -1 ]
    [ -4  5  0 -1 ] = [ 0  2 -1 -1 ]   [ 0  2 -1 -1 ]   = U^T U
    [ -8  0 14  0 ]   [ 0  0  3 -1 ]   [ 0  0  3 -1 ]
    [ -4 -1  0  7 ]   [ 0  0  0  2 ]   [ 0  0  0  2 ]
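The rank-1 peeling above can be run mechanically. Here is a short pure-Python sketch of my own (not the talk's code) of the additive view, assuming the leading diagonal entry stays positive, as it does for this PSD example:

```python
# Additive view of Gaussian elimination: repeatedly peel off the rank-1
# matrix that agrees with the current matrix on its leading row/column,
# collecting the rows of U so that U^T U reproduces the original matrix.
def additive_cholesky(M):
    n = len(M)
    S = [row[:] for row in M]                    # remaining matrix
    U = []
    for i in range(n):
        piv = S[i][i] ** 0.5
        u = [S[i][j] / piv for j in range(n)]    # u u^T agrees with S
        U.append(u)                              # on row/column i
        for r in range(n):                       # subtract the rank-1 term
            for c in range(n):
                S[r][c] -= u[r] * u[c]
    return U

M = [[16, -4, -8, -4],
     [-4,  5,  0, -1],
     [-8,  0, 14,  0],
     [-4, -1,  0,  7]]
U = additive_cholesky(M)
for row in U:
    print([round(v, 6) for v in row])
```

On the slide's 4×4 example this recovers exactly the rows (4, -1, -2, -1), (0, 2, -1, -1), (0, 0, 3, -1), (0, 0, 0, 2).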
Additive View of Gaussian Elimination
What is special about Gaussian Elimination on Laplacians?
The remaining matrix is always Laplacian.
L = [ 16 -8 -4 -4 ]
    [ -8  8  0  0 ]
    [ -4  0  4  0 ]
    [ -4  0  0  4 ]
Additive View of Gaussian Elimination
What is special about Gaussian Elimination on Laplacians?
The remaining matrix is always Laplacian.
L = [ 16 -8 -4 -4 ]   [ 16 -8 -4 -4 ]   [ 0  0  0  0 ]
    [ -8  8  0  0 ] = [ -8  4  2  2 ] + [ 0  4 -2 -2 ]
    [ -4  0  4  0 ]   [ -4  2  1  1 ]   [ 0 -2  3 -1 ]
    [ -4  0  0  4 ]   [ -4  2  1  1 ]   [ 0 -2 -1  3 ]
Additive View of Gaussian Elimination
What is special about Gaussian Elimination on Laplacians?
The remaining matrix is always Laplacian.
L = [  4 ]                  [ 0  0  0  0 ]
    [ -2 ] [ 4 -2 -1 -1 ] + [ 0  4 -2 -2 ]
    [ -1 ]                  [ 0 -2  3 -1 ]
    [ -1 ]                  [ 0 -2 -1  3 ]

The remainder [  4 -2 -2 ]
              [ -2  3 -1 ]
              [ -2 -1  3 ]
is a new Laplacian!
Why is Gaussian Elimination Slow?

Solving Lx = b by Gaussian Elimination can take Ω(n³) time.
The main issue is fill:
eliminating a vertex v creates a clique on the neighbors of v (a new Laplacian).
But Laplacian cliques can be sparsified!
Gaussian Elimination
1. Pick a vertex v to eliminate
2. Add the clique created by eliminating v
3. Repeat until done

Approximate Gaussian Elimination
1. Pick a random vertex v to eliminate
2. Sample the clique created by eliminating v
3. Repeat until done

Resembles randomized Incomplete Cholesky
Approximating Matrices by Sampling
Goal: U^T U ≈ L
Approach:
1. E[U^T U] = L
2. Show U^T U is concentrated around its expectation
Gives U^T U ≈ L with high probability
Approximating Matrices in Expectation
Consider eliminating the first variable.

L^(0) = c₁c₁^T + [ remaining graph + clique ]        (c₁c₁^T is the rank-1 term)

Now replace the clique by a sparsified clique:

L^(1) = c₁c₁^T + [ remaining graph + sparsified clique ]

Suppose E[ sparsified clique ] = clique. Then
E[ L^(1) ] = c₁c₁^T + [ remaining graph + clique ] = L^(0).
Approximating Matrices in Expectation
Let L^(i) be our approximation after i eliminations.
If we ensure at each step
E[ sparsified clique | previous steps ] = clique
then
E[ L^(i) ] = L^(i-1), i.e. E[ L^(i) - L^(i-1) | previous steps ] = 0.
Approximation?
Approximate Gaussian Elimination
Find U, upper triangular matrix, s.t.
U^T U ≈ L
Not ‖U^T U - L‖ ≤ 0.5, but ‖L^{-1/2} U^T U L^{-1/2} - I‖ ≤ 0.5
Essential Tools
Isotropic position: nor(A) ≝ L^{-1/2} A L^{-1/2}
Goal is now ‖nor(U^T U) - I‖ ≤ 0.5
PSD Order: A ≼ B
iff for all x: x^T A x ≤ x^T B x
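The PSD order has an equivalent eigenvalue reading: A ≼ B iff B - A is positive semidefinite. A quick numeric sketch of my own, using the closed-form minimum eigenvalue of a symmetric 2×2 matrix (the matrices `half` and `I2` below are made-up examples):

```python
# Check A ≼ B by testing whether the minimum eigenvalue of B - A is >= 0,
# which for symmetric 2x2 matrices has a closed form.
import math

def psd_leq_2x2(A, B, tol=1e-12):
    """A, B symmetric 2x2 as ((a, b), (b, d)). True iff A ≼ B."""
    a = B[0][0] - A[0][0]
    b = B[0][1] - A[0][1]
    d = B[1][1] - A[1][1]
    min_eig = (a + d) / 2 - math.sqrt(((a - d) / 2) ** 2 + b * b)
    return min_eig >= -tol

I2 = ((1.0, 0.0), (0.0, 1.0))
half = ((0.5, 0.2), (0.2, 0.5))
print(psd_leq_2x2(half, I2))   # half ≼ I holds
print(psd_leq_2x2(I2, half))   # I ≼ half does not
```

Note that ≼ is only a partial order: for many pairs neither A ≼ B nor B ≼ A holds.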
Matrix Concentration: Edge Variables
Zero-mean variables X_e = Y_e - E[Y_e]
Isotropic position variables nor(X_e)

Y_e = (1/p_e) · L_e  with probability p_e
Y_e = 0              otherwise
Predictable Quadratic Variation
Predictable quadratic variation = ∑_e E[ X_e² | prev. steps ]
Want to show ℙ[ ‖ ∑_e E[X_e² | prev. steps] ‖ > σ² ] ≤ ε
Promise: E[ nor(X_e)² ] ≼ R · nor(L_e)
Sample Variance
E_v [ ∑_{e ∈ elim. clique of v} nor(L_e) ] ≼ E_v [ ∑_{e ∋ v} nor(L_e) ]
≼ (2/# vertices) · nor(current Laplacian)
≼ (4/# vertices) · nor(L) = (4/# vertices) · I
(WARNING: the last step uses current Laplacian ≼ 2L, which is only true w.h.p.)
Sample Variance
Recall E[ nor(X_e)² ] ≼ R · nor(L_e)
Putting it together:
E_v [ ∑_{e ∈ elim. clique of v} nor(L_e) ] ≼ (4/# vertices) · I
so
E_v [ ∑_{e ∈ elim. clique of v} E[nor(X_e)²] ] ≼ (4R/# vertices) · I
— the variance in one round of elimination.
Sample Variance
Total variance ≼ ∑_{rounds of elimination} (4R/# vertices) · I ≼ 4 R log n · I
(a harmonic sum: # vertices counts down from n to 1 over the rounds)
Summary
Matrix martingales: a natural fit for algorithmic analysis
Understanding the Predictable Quadratic Variation is key
Some results using matrix martingales
Cohen, Musco, Pachocki ’16 – online row sampling
Kyng, Sachdeva ’17 – approximate Gaussian elimination
Kyng, Peng, Pachocki, Sachdeva ’17 – semi-streaming graph sparsification
Cohen, Kelner, Kyng, Peebles, Peng, Rao, Sidford ’18 – solving Directed Laplacian eqs.
Kyng, Song ’18 – Matrix Chernoff bound for negatively dependent random variables
Thanks!
How to Sample a Clique
When eliminating a vertex v, for each edge (v, a):
  pick an edge (v, b) with probability ~ w(v, b)
  insert edge (a, b) with weight w(v,a)·w(v,b) / (w(v,a) + w(v,b))

(Figures: an animation of this sampling on the star around the eliminated vertex, with its neighbors and edge weights labeled.)
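The clique-sampling rule above can be sketched in a few lines. This is my own illustration, not the talk's code; the star weights are made-up numbers, and I represent the eliminated vertex's star as a plain dictionary:

```python
# Sample a sparsified clique for an eliminated vertex v: for each neighbor a,
# pick a second neighbor b with probability proportional to w(v, b) and add
# edge (a, b) with weight w(v,a)*w(v,b)/(w(v,a)+w(v,b)).
import random

random.seed(3)

def sample_clique(star):
    """star: non-empty dict neighbor -> weight of its edge to v.
    Returns a list of sampled clique edges (a, b, weight)."""
    nbrs = list(star)
    total = sum(star.values())
    out = []
    for a in nbrs:
        r = random.random() * total        # pick b with prob ~ w(v, b)
        acc = 0.0
        for b in nbrs:
            acc += star[b]
            if r <= acc:
                break
        if b != a:                         # skip self-loops
            wa, wb = star[a], star[b]
            out.append((a, b, wa * wb / (wa + wb)))
    return out

# Hypothetical star: vertex v eliminated, neighbors 2..5 with example weights.
star = {2: 3.0, 3: 2.0, 4: 4.0, 5: 5.0}
for edge in sample_clique(star):
    print(edge)
```

Note the key property for sparsity: a degree-d vertex yields at most d sampled edges, instead of the ~d²/2 edges of the full clique.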
Practice
Julia Package: Laplacians.jl
tiny.cc/spielman-solver-code
Theory
Nearly-linear time Directed Laplacian solvers using approximate Gaussian elimination [CKKPPSR17]
This is really the end