Matrix Martingales in Randomized Numerical Linear Algebra


Speaker: Rasmus Kyng, Harvard University

Sep 27, 2018

Concentration of Scalar Random Variables

Random $X = \sum_i X_i$, with $X_i \in \mathbb{R}$
1. $X_i$ are independent
2. $\mathbb{E}\,X_i = 0$

Is $X \approx 0$ with high probability?

Concentration of Scalar Random Variables

Bernstein's Inequality
Random $X = \sum_i X_i$, with $X_i \in \mathbb{R}$
1. $X_i$ are independent
2. $\mathbb{E}\,X_i = 0$
3. $|X_i| \le R$
4. $\sum_i \mathbb{E}\,X_i^2 \le \sigma^2$

gives

$\mathbb{P}[|X| > t] \le 2\exp\left(-\frac{t^2/2}{Rt + \sigma^2}\right) \le \delta$

E.g. if $t = 0.5$ and $R, \sigma^2 = 0.1/\log(1/\delta)$.
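As a quick sanity check (not part of the talk), the Julia sketch below Monte-Carlo-estimates the tail probability for sums of bounded coin flips with the parameters of this example and compares it to the Bernstein bound.

```julia
using Random

Random.seed!(1)
delta = 0.01
R = 0.1 / log(1 / delta)          # per-step bound |X_i| <= R, as in the example
sigma2 = R                        # target total variance
n = round(Int, sigma2 / R^2)      # n steps of variance R^2 each give sigma2 in total
t = 0.5
trials = 100_000
tail = count(_ -> abs(sum(rand([-R, R], n))) > t, 1:trials) / trials
bound = 2 * exp(-(t^2 / 2) / (R * t + sigma2))
println("empirical tail ≈ $tail   Bernstein bound ≈ $bound")
```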

Concentration of Scalar Martingales

Random $X = \sum_i X_i$, with $X_i \in \mathbb{R}$
1. $X_i$ are independent (this will be dropped)
2. $\mathbb{E}\,X_i = 0$ becomes $\mathbb{E}[X_i \mid X_1, \ldots, X_{i-1}] = 0$

Concentration of Scalar Martingales

Random $X = \sum_i X_i$, with $X_i \in \mathbb{R}$
1. $X_i$ are independent (dropped)
2. $\mathbb{E}[X_i \mid \text{previous steps}] = 0$

Concentration of Scalar Martingales

Freedman's Inequality, built from Bernstein's Inequality:
Random $X = \sum_i X_i$, with $X_i \in \mathbb{R}$
1. $X_i$ are independent
2. $\mathbb{E}\,X_i = 0$
3. $|X_i| \le R$
4. $\sum_i \mathbb{E}\,X_i^2 \le \sigma^2$

gives

$\mathbb{P}[|X| > t] \le 2\exp\left(-\frac{t^2/2}{Rt + \sigma^2}\right)$

To handle martingales, replace condition 2 with $\mathbb{E}[X_i \mid \text{previous steps}] = 0$, replace condition 4 with $\mathbb{P}\left[\sum_i \mathbb{E}[X_i^2 \mid \text{previous steps}] > \sigma^2\right] \le \delta$, and add $+\,\delta$ to the tail bound.

Concentration of Scalar Martingales

Freedman's Inequality
Random $X = \sum_i X_i$, with $X_i \in \mathbb{R}$
1. (independence not required)
2. $\mathbb{E}[X_i \mid \text{previous steps}] = 0$
3. $|X_i| \le R$
4. $\mathbb{P}\left[\sum_i \mathbb{E}[X_i^2 \mid \text{previous steps}] > \sigma^2\right] \le \delta$

gives

$\mathbb{P}[|X| > t] \le 2\exp\left(-\frac{t^2/2}{Rt + \sigma^2}\right) + \delta$

Concentration of Matrix Random Variables

Matrix Bernstein's Inequality (Tropp '11)
Random $X = \sum_i X_i$, with $X_i \in \mathbb{R}^{n \times n}$ symmetric
1. $X_i$ are independent
2. $\mathbb{E}\,X_i = 0$
3. $\|X_i\| \le R$
4. $\left\|\sum_i \mathbb{E}\,X_i^2\right\| \le \sigma^2$

gives

$\mathbb{P}[\|X\| > t] \le n \cdot 2\exp\left(-\frac{t^2/2}{Rt + \sigma^2}\right)$
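A minimal Julia sketch (illustrative, not from the talk): sum many independent, zero-mean, norm-bounded random symmetric rank-1 matrices and compare the resulting spectral norm against the matrix Bernstein tail bound at $t = 0.5$.

```julia
using LinearAlgebra, Random

Random.seed!(2)
d, n, R = 50, 2000, 0.01
X = zeros(d, d)
for _ in 1:n
    v = normalize(randn(d))               # random unit vector
    X .+= rand([-1, 1]) * R * (v * v')    # independent, zero-mean, ||X_i|| = R
end
sigma2 = n * R^2 / d                      # E[v*v'] = I/d, so ||sum_i E X_i^2|| = n*R^2/d
t = 0.5
bound = d * 2 * exp(-(t^2 / 2) / (R * t + sigma2))
println("||X|| = ", opnorm(Symmetric(X)), "   matrix Bernstein bound at t = 0.5: ", bound)
```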

Concentration of Matrix Martingales

Matrix Freedman's Inequality (Tropp '11)
Random $X = \sum_i X_i$, with $X_i \in \mathbb{R}^{n \times n}$ symmetric
1. (independence not required)
2. $\mathbb{E}[X_i \mid \text{previous steps}] = 0$
3. $\|X_i\| \le R$
4. $\mathbb{P}\left[\left\|\sum_i \mathbb{E}[X_i^2 \mid \text{previous steps}]\right\| > \sigma^2\right] \le \delta$

gives

$\mathbb{P}[\|X\| > t] \le n \cdot 2\exp\left(-\frac{t^2/2}{Rt + \sigma^2}\right) + \delta$

E.g. if $t = 0.5$ and $R, \sigma^2 = 0.1/\log(n/\epsilon)$, the bound is $\le \epsilon + \delta$.

Concentration of Matrix Martingales

Matrix Freedman’s Inequality (Tropp 11)

$\mathbb{P}\left[\left\|\sum_i \mathbb{E}[X_i^2 \mid \text{prev. steps}]\right\| > \sigma^2\right] \le \delta$

Predictable quadratic variation $= \sum_i \mathbb{E}[X_i^2 \mid \text{prev. steps}]$

Laplacian Matrices

! "Graph # = (&, ()Edge weights *:( → ℝ./ = &0 = |(|

2

/×/ matrix 4 = 56∈8

46

Laplacian Matrices

! "Graph # = (&, ()Edge weights *:( → ℝ./ = &0 = |(|

2

3 = 45∈7

35

L(a,b) = w(a,b)

. . . a . . . b . . .�

�������

�������

...a 1 �1...b �1 1...

/×/ matrix

w(a,b)

Laplacian Matrices

$A$: weighted adjacency matrix of the graph

$D$: diagonal matrix of weighted degrees

$L = D - A$

$A_{uv} = w_{uv}$, $D_{uu} = \sum_v w_{uv}$

Graph $G = (V, E)$ with edge weights $w : E \to \mathbb{R}_+$, $n = |V|$, $m = |E|$

Laplacian Matrices

Symmetric matrix $L$

All off-diagonals are non-positive, and $L_{uu} = \sum_{v \ne u} |L_{uv}|$

Laplacian of a Graph

$\begin{pmatrix} 1 & -1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & -1 & 1 \end{pmatrix} + 2\begin{pmatrix} 1 & 0 & -1 \\ 0 & 0 & 0 \\ -1 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 3 & -1 & -2 \\ -1 & 2 & -1 \\ -2 & -1 & 3 \end{pmatrix}$

(a triangle on three vertices: unit-weight edges $(1,2)$ and $(2,3)$, and a weight-2 edge $(1,3)$)
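As a small illustration (not part of the talk), the following Julia sketch builds $L = D - A$ from the weighted edge list of this three-vertex example and reproduces the matrix on the right-hand side.

```julia
using LinearAlgebra

# Build L = D - A from a weighted edge list; edges are (u, v, w) tuples.
function laplacian(n, edges)
    A = zeros(n, n)
    for (u, v, w) in edges
        A[u, v] += w
        A[v, u] += w                           # weighted adjacency matrix
    end
    return Diagonal(vec(sum(A, dims = 2))) - A # D - A
end

# Unit-weight edges (1,2) and (2,3), weight-2 edge (1,3), as in the example above.
L = laplacian(3, [(1, 2, 1.0), (2, 3, 1.0), (1, 3, 2.0)])
# L == [3 -1 -2; -1 2 -1; -2 -1 3]; rows sum to zero, off-diagonals are non-positive.
```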

Laplacian Matrices

[ST04]: solving Laplacian linear equations in $\tilde{O}(m)$ time

[KS16]: simple algorithm

Solving a Laplacian Linear Equation

!" = $

Gaussian EliminationFind %, upper triangular matrix, s.t.

%&% = !Then

" = %'(%'&$

Easy to apply %'( and %'&
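Below is a minimal Julia sketch of this pipeline (illustrative, not the talk's algorithm), using the positive-definite 4×4 example matrix from the elimination slides further down; note that a true graph Laplacian is singular, so in practice one works on a connected component and up to the all-ones nullspace.

```julia
using LinearAlgebra

S = [16.0 -4 -8 -4;
     -4 5 0 -1;
     -8 0 14 0;
     -4 -1 0 7]
b = [1.0, 2.0, 3.0, 4.0]

U = cholesky(Symmetric(S)).U   # upper triangular with U' * U == S
x = U \ (U' \ b)               # two triangular solves: x = U^{-1} U^{-T} b
@assert norm(S * x - b) < 1e-10
```

The factor computed here matches the matrix $U$ that the additive elimination example below produces by hand.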

Solving a Laplacian Linear Equation

!" = $

Approximate Gaussian EliminationFind %, upper triangular matrix, s.t.

%&% ≈ !

% is sparse.

( log ,- iterations to get.-approximate solution /".

Approximate Gaussian Elimination

Theorem [KS]
When $L$ is a Laplacian matrix with $m$ non-zeros, we can find in $O(m \log^3 n)$ time an upper triangular matrix $U$ with $O(m \log^3 n)$ non-zeros, s.t. w.h.p.

$U^\top U \approx L$

Additive View of Gaussian Elimination

Find $U$, upper triangular matrix, s.t. $U^\top U = S$

$S = \begin{pmatrix} 16 & -4 & -8 & -4 \\ -4 & 5 & 0 & -1 \\ -8 & 0 & 14 & 0 \\ -4 & -1 & 0 & 7 \end{pmatrix}$

Find the rank-1 matrix that agrees with $S$ on the first row and column:

$\begin{pmatrix} 16 & -4 & -8 & -4 \\ -4 & 1 & 2 & 1 \\ -8 & 2 & 4 & 2 \\ -4 & 1 & 2 & 1 \end{pmatrix} = \begin{pmatrix} 4 \\ -1 \\ -2 \\ -1 \end{pmatrix} \begin{pmatrix} 4 & -1 & -2 & -1 \end{pmatrix}$

Additive View of Gaussian Elimination

$\begin{pmatrix} 16 & -4 & -8 & -4 \\ -4 & 5 & 0 & -1 \\ -8 & 0 & 14 & 0 \\ -4 & -1 & 0 & 7 \end{pmatrix} - \begin{pmatrix} 16 & -4 & -8 & -4 \\ -4 & 1 & 2 & 1 \\ -8 & 2 & 4 & 2 \\ -4 & 1 & 2 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 4 & -2 & -2 \\ 0 & -2 & 10 & -2 \\ 0 & -2 & -2 & 6 \end{pmatrix}$

Subtract the rank-1 matrix. We have eliminated the first variable.

Additive View of Gaussian Elimination

The remaining matrix is PSD.

$\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 4 & -2 & -2 \\ 0 & -2 & 10 & -2 \\ 0 & -2 & -2 & 6 \end{pmatrix}$

Additive View of Gaussian Elimination

$\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 4 & -2 & -2 \\ 0 & -2 & 10 & -2 \\ 0 & -2 & -2 & 6 \end{pmatrix}$

Find the rank-1 matrix that agrees with our matrix on the next row and column:

$\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 4 & -2 & -2 \\ 0 & -2 & 1 & 1 \\ 0 & -2 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 2 \\ -1 \\ -1 \end{pmatrix} \begin{pmatrix} 0 & 2 & -1 & -1 \end{pmatrix}$

Additive View of Gaussian Elimination

Subtract the rank-1 matrix. We have eliminated the second variable.

$\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 4 & -2 & -2 \\ 0 & -2 & 10 & -2 \\ 0 & -2 & -2 & 6 \end{pmatrix} - \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 4 & -2 & -2 \\ 0 & -2 & 1 & 1 \\ 0 & -2 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 9 & -3 \\ 0 & 0 & -3 & 5 \end{pmatrix}$

Additive View of Gaussian Elimination

Repeat until all parts are written as rank-1 terms.

$S = \begin{pmatrix} 16 & -4 & -8 & -4 \\ -4 & 5 & 0 & -1 \\ -8 & 0 & 14 & 0 \\ -4 & -1 & 0 & 7 \end{pmatrix} = \begin{pmatrix} 4 \\ -1 \\ -2 \\ -1 \end{pmatrix} \begin{pmatrix} 4 & -1 & -2 & -1 \end{pmatrix} + \begin{pmatrix} 0 \\ 2 \\ -1 \\ -1 \end{pmatrix} \begin{pmatrix} 0 & 2 & -1 & -1 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ 3 \\ -1 \end{pmatrix} \begin{pmatrix} 0 & 0 & 3 & -1 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ 0 \\ 2 \end{pmatrix} \begin{pmatrix} 0 & 0 & 0 & 2 \end{pmatrix}$

Additive View of Gaussian Elimination

$\begin{pmatrix} 16 & -4 & -8 & -4 \\ -4 & 5 & 0 & -1 \\ -8 & 0 & 14 & 0 \\ -4 & -1 & 0 & 7 \end{pmatrix} = \begin{pmatrix} 4 & 0 & 0 & 0 \\ -1 & 2 & 0 & 0 \\ -2 & -1 & 3 & 0 \\ -1 & -1 & -1 & 2 \end{pmatrix} \begin{pmatrix} 4 & -1 & -2 & -1 \\ 0 & 2 & -1 & -1 \\ 0 & 0 & 3 & -1 \\ 0 & 0 & 0 & 2 \end{pmatrix}$

Additive View of Gaussian Elimination

Repeat until all parts are written as rank-1 terms.

$S = \begin{pmatrix} 16 & -4 & -8 & -4 \\ -4 & 5 & 0 & -1 \\ -8 & 0 & 14 & 0 \\ -4 & -1 & 0 & 7 \end{pmatrix} = \begin{pmatrix} 4 & -1 & -2 & -1 \\ 0 & 2 & -1 & -1 \\ 0 & 0 & 3 & -1 \\ 0 & 0 & 0 & 2 \end{pmatrix}^{\!\top} \begin{pmatrix} 4 & -1 & -2 & -1 \\ 0 & 2 & -1 & -1 \\ 0 & 0 & 3 & -1 \\ 0 & 0 & 0 & 2 \end{pmatrix} = U^\top U$

Additive View of Gaussian Elimination

The upper triangular factor $U$ collects the rank-1 terms as its rows.

Additive View of Gaussian Elimination

What is special about Gaussian Elimination on Laplacians?

The remaining matrix is always Laplacian.

$L = \begin{pmatrix} 16 & -8 & -4 & -4 \\ -8 & 8 & 0 & 0 \\ -4 & 0 & 4 & 0 \\ -4 & 0 & 0 & 4 \end{pmatrix}$

Additive View of Gaussian Elimination

What is special about Gaussian Elimination on Laplacians?

The remaining matrix is always Laplacian.

L =

���

16 �8 �4 �4�8 4 2 2�4 2 1 1�4 2 1 1

��� +

���

0 0 0 00 4 �2 �20 �2 3 �10 �2 �1 3

���! =

Additive View of Gaussian Elimination

What is special about Gaussian Elimination on Laplacians?

The remaining matrix is always Laplacian.

$L = \begin{pmatrix} 4 \\ -2 \\ -1 \\ -1 \end{pmatrix} \begin{pmatrix} 4 & -2 & -1 & -1 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 4 & -2 & -2 \\ 0 & -2 & 3 & -1 \\ 0 & -2 & -1 & 3 \end{pmatrix}$

The second term is a new Laplacian!

$\begin{pmatrix} 4 & -2 & -2 \\ -2 & 3 & -1 \\ -2 & -1 & 3 \end{pmatrix}$
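As a quick check (illustrative, not from the talk), the Julia sketch below eliminates vertex 1 of the star example above and verifies that what remains on the other vertices is again a Laplacian: rows sum to zero and off-diagonals are non-positive.

```julia
using LinearAlgebra

L = [16.0 -8 -4 -4;
     -8 8 0 0;
     -4 0 4 0;
     -4 0 0 4]
c = L[:, 1] / sqrt(L[1, 1])          # rank-1 term for eliminating vertex 1
S = (L - c * c')[2:end, 2:end]       # remaining matrix: [4 -2 -2; -2 3 -1; -2 -1 3]
@assert all(abs.(sum(S, dims = 2)) .< 1e-12)   # rows sum to zero
@assert all(S - Diagonal(S) .<= 0)             # off-diagonals non-positive: a Laplacian
```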

Why is Gaussian Elimination Slow?

Solving !" = $ by Gaussian Elimination can take Ω &' time.

The main issue is fill

(! =

Why is Gaussian Elimination Slow?

Solving !" = $ by Gaussian Elimination can take Ω &' time.

The main issue is fill

(! =

Solving !" = $ by Gaussian Elimination can take Ω &' time.

The main issue is fill

Why is Gaussian Elimination Slow?

New Laplacian(

Elimination creates a cliqueon the neighbors of (

! =

Solving !" = $ by Gaussian Elimination can take Ω &' time.

The main issue is fill

Why is Gaussian Elimination Slow?

(

Laplacian cliques can be sparsified!

New Laplacian

! =

Gaussian Elimination

1. Pick a vertex $v$ to eliminate
2. Add the clique created by eliminating $v$
3. Repeat until done

Approximate Gaussian Elimination

1. Pick a random vertex $v$ to eliminate
2. Sample the clique created by eliminating $v$
3. Repeat until done

Resembles randomized Incomplete Cholesky

Approximating Matrices by Sampling

Goal !"! ≈ $

Approach1. % !"! = $2. Show !"! concentrated around expectation

Gives !"! ≈ $ w. high probability

Approximating Matrices in Expectation

Consider eliminating the first variable.

Original Laplacian:
$L^{(0)} = c_1 c_1^\top + \big(\text{remaining graph} + \text{clique}\big)$
where $c_1 c_1^\top$ is the rank-1 term.

After one step of approximate elimination:
$L^{(1)} = c_1 c_1^\top + \big(\text{remaining graph} + \text{sparsified clique}\big)$

Suppose $\mathbb{E}\big[\text{sparsified clique}\big] = \text{clique}$. Then

$\mathbb{E}\,L^{(1)} = c_1 c_1^\top + \mathbb{E}\big[\text{remaining graph} + \text{sparsified clique}\big] = L^{(0)}$

Approximating Matrices in Expectation

Let $L^{(i)}$ be our approximation after $i$ eliminations.

If we ensure at each step that

$\mathbb{E}\big[\text{sparsified clique}\big] = \text{clique}$

then

$\mathbb{E}\big[L^{(i)} \mid \text{previous steps}\big] = L^{(i-1)}$, i.e. $\mathbb{E}\big[L^{(i)} - L^{(i-1)} \mid \text{previous steps}\big] = 0$.

Approximation?

Approximate Gaussian Elimination
Find $U$, upper triangular matrix, s.t. $U^\top U \approx L$, where "$\approx$" means the relative guarantee

$\left\| L^{-1/2}\, U^\top U\, L^{-1/2} - I \right\| \le 0.5$

rather than the absolute bound $\|U^\top U - L\| \le 0.5$.

Essential Tools

Isotropic position: $\tilde{A} \stackrel{\text{def}}{=} L^{-1/2} A\, L^{-1/2}$

Goal is now $\widetilde{U^\top U} - I \approx 0$

PSD order: $A \preceq B$ iff for all $x$: $x^\top A x \le x^\top B x$
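A tiny Julia illustration (not from the talk) of checking the PSD order numerically: $A \preceq B$ exactly when $B - A$ has no negative eigenvalues.

```julia
using LinearAlgebra

A = [2.0 -1.0; -1.0 2.0]
B = [3.0 -1.0; -1.0 4.0]
# A ⪯ B  ⇔  x'Ax ≤ x'Bx for all x  ⇔  B - A is positive semidefinite
@assert eigmin(Symmetric(B - A)) >= -1e-12
```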

Matrix Concentration: Edge Variables

Zero-mean variables $X_e = Y_e - L_e$, in isotropic position $\tilde{X}_e$, where each sample is

$Y_e = \frac{1}{p_e} L_e$ with probability $p_e$, and $Y_e = 0$ otherwise (so $\mathbb{E}\,Y_e = L_e$).

Predictable Quadratic Variation

Predictable quadratic variation $= \sum_i \mathbb{E}\big[X_i^2 \mid \text{prev. steps}\big]$

Want to show: $\mathbb{P}\left[\left\|\sum_i \mathbb{E}\big[X_i^2 \mid \text{prev. steps}\big]\right\| > \sigma^2\right] \le \delta$

Promise: $\mathbb{E}\,\tilde{X}_e^2 \preceq R\,\tilde{L}_e$, where $R$ is the norm bound on each sample.

Sample Variance

!" #$∈ "

&'$ ≼ !"#$∋"

&'$

* *

elim. clique of

Sample Variance

!" #$∈ "

&'$ ≼ !"#$∋"

&'$

* *

elim. clique of

≼ 4&'# vertices

= 4-# vertices

= 2current lap.# vertices

WARNING: only true w.h.p.

= " 4$

Sample Variance

Recall the promise $\mathbb{E}\,\tilde{X}_e^2 \preceq R\,\tilde{L}_e$.

Putting it together:

$\mathbb{E}_v \sum_{e\,\in\,\text{elim. clique of } v} \tilde{L}_e \;\preceq\; \mathbb{E}_v \sum_{e \ni v} \tilde{L}_e \;=\; \frac{4I}{\#\text{vertices}}$

so

$\mathbb{E}_v \sum_{e\,\in\,\text{elim. clique of } v} \mathbb{E}\,\tilde{X}_e^2 \;\preceq\; \frac{4R\,I}{\#\text{vertices}}$, the variance in one round of elimination.

Sample Variance

$\sum_i \mathbb{E}\big[\tilde{X}_i^2 \mid \text{prev. steps}\big] \;\preceq\; \sum_{\text{rounds of elimination}} \big(\text{variance in one round}\big) \;\preceq\; \sum_{\text{rounds of elimination}} \frac{4R\,I}{\#\text{vertices}} \;\preceq\; 4R \log n \cdot I$

(The number of remaining vertices runs down from $n$ to $1$, and $\sum_{k=1}^{n} \tfrac{1}{k} \approx \log n$.)

Summary

Matrix martingales: a natural fit for algorithmic analysis

Understanding the Predictable Quadratic Variation is key

Some results using matrix martingales

Cohen, Musco, Pachocki ’16 – online row sampling

Kyng, Sachdeva ’17 – approximate Gaussian elimination

Kyng, Peng, Pachocki, Sachdeva ’17 – semi-streaming graph sparsification

Cohen, Kelner, Kyng, Peebles, Peng, Rao, Sidford ’18 – solving Directed Laplacian eqs.

Kyng, Song ’18 – Matrix Chernoff bound for negatively dependent random variables

Thanks!

How to Sample a Clique

For each edge $(v, a)$:
pick an edge $(v, b)$ with probability $\sim w_b$,
insert edge $(a, b)$ with weight $\frac{w_a w_b}{w_a + w_b}$.

[Figure: the clique created by eliminating $v$ is sampled edge by edge on a small example graph.]
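A Monte Carlo sketch in Julia of the sampling rule above on a single star (illustrative; draws with $b = a$, which would be self-loops, are simply skipped). In expectation, the weight placed on each pair $(a, b)$ should match the weight $\frac{w_a w_b}{\sum_c w_c}$ that exact elimination of the center vertex would create.

```julia
using Random

Random.seed!(3)
w = [1.0, 2.0, 3.0, 4.0]          # weights of the star edges (v,1), ..., (v,4)
W = sum(w)
k = length(w)
cw = cumsum(w)

# True clique from eliminating v: weight w[a]*w[b]/W on each pair (a, b).
exact = [a == b ? 0.0 : w[a] * w[b] / W for a in 1:k, b in 1:k]

function sample_clique(w, W, cw, k)
    C = zeros(k, k)
    for a in 1:k                                  # for each edge (v, a):
        b = searchsortedfirst(cw, rand() * W)     #   pick (v, b) with probability ~ w[b]
        if b != a                                 #   (skip the self-loop case)
            wt = w[a] * w[b] / (w[a] + w[b])
            C[a, b] += wt                         #   insert edge (a, b)
            C[b, a] += wt
        end
    end
    return C
end

trials = 200_000
avg = sum(sample_clique(w, W, cw, k) for _ in 1:trials) / trials
println("max deviation from the exact clique: ", maximum(abs.(avg - exact)))  # small (Monte Carlo error)
```

The unbiasedness comes from summing the two ways a pair can be produced: $\frac{w_b}{W}\cdot\frac{w_a w_b}{w_a + w_b} + \frac{w_a}{W}\cdot\frac{w_a w_b}{w_a + w_b} = \frac{w_a w_b}{W}$.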

Practice

Julia Package: Laplacians.jl

tiny.cc/spielman-solver-code

Theory

Nearly-linear time Directed Laplacian solvers using approximate Gaussian elimination [CKKPPSR17]

This is really the end