Transcript of: A Matrix Expander Chernoff Bound (nikhil/mchernoff.pdf) · Ankit Garg, Yin Tat Lee, Zhao Song, Nikhil Srivastava
-
A Matrix Expander Chernoff Bound
Ankit Garg, Yin Tat Lee, Zhao Song, Nikhil Srivastava
UC Berkeley
-
Vanilla Chernoff Bound
Thm [Hoeffding, Chernoff]. If X_1, …, X_n are independent mean-zero random variables with |X_i| ≤ 1, then
Pr[ |(1/n) Σ_i X_i| ≥ ε ] ≤ 2 exp(−nε²/4)
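The bound is easy to sanity-check numerically. A minimal sketch (not from the talk), using only the standard library; `n`, `eps`, and the trial count are illustrative choices:

```python
import math
import random

def empirical_tail(n, eps, trials, seed=0):
    """Estimate Pr[ |(1/n) * sum_i X_i| >= eps ] for i.i.d. Rademacher X_i."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = sum(rng.choice((-1, 1)) for _ in range(n))
        if abs(s / n) >= eps:
            hits += 1
    return hits / trials

n, eps = 100, 0.4
chernoff_bound = 2 * math.exp(-n * eps**2 / 4)   # 2 exp(-n eps^2 / 4)
p_hat = empirical_tail(n, eps, trials=20000)
```

The empirical tail probability should sit (well) below the bound, since Chernoff is not tight for Rademacher variables.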
-
Vanilla Chernoff Bound
Two extensions:
1. Dependent random variables
2. Sums of random matrices
-
Expander Chernoff Bound [AKS'87, G'94]
Thm [Gillman'94]: Suppose G = (V, E) is a regular graph with transition matrix P which has second eigenvalue λ. Let f: V → ℝ be a function with |f(v)| ≤ 1 and Σ_v f(v) = 0. Then, if v_1, …, v_k is a stationary random walk:
Pr[ |(1/k) Σ_i f(v_i)| ≥ ε ] ≤ 2 exp(−c(1−λ)kε²)
Implies a walk of length k ≳ (1−λ)^{−1} concentrates around the mean.
-
Intuition for Dependence on Spectral Gap
(Figure: two clusters of vertices joined by a bottleneck, one labeled f = +1, the other f = −1; here 1 − λ = 1/n².)
Typical random walk takes Ω(n²) steps to see both ±1.
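The 1 − λ ≈ 1/n² scaling can be checked on a concrete graph. A quick numerical sketch, assuming numpy is available; it uses a lazy walk on the n-cycle (a stand-in for the slide's bottleneck picture, chosen because its spectrum is exactly computable):

```python
import numpy as np

def lazy_cycle_transition(n):
    """Transition matrix of the lazy random walk on the n-cycle:
    stay put w.p. 1/2, else move to a uniform neighbor."""
    P = np.zeros((n, n))
    for v in range(n):
        P[v, v] = 0.5
        P[v, (v - 1) % n] += 0.25
        P[v, (v + 1) % n] += 0.25
    return P

n = 40
P = lazy_cycle_transition(n)
eigs = np.sort(np.linalg.eigvalsh(P))
lam = eigs[-2]          # second eigenvalue (the top one is 1)
gap = 1 - lam           # equals (1 - cos(2*pi/n))/2 ~ pi^2 / n^2
```

So the gap decays like 1/n², matching the Ω(n²) steps needed to cross between the two halves.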
-
Derandomization Motivation
Thm [Gillman'94] (as above): for a stationary random walk v_1, …, v_k on a regular graph with second eigenvalue λ,
Pr[ |(1/k) Σ_i f(v_i)| ≥ ε ] ≤ 2 exp(−c(1−λ)kε²)
To generate k independent samples from [n] we need k log n bits. If G is 3-regular with constant λ, the walk needs only log n + k log(3) bits.
When k = O(log n) this reduces the randomness quadratically, and one can completely derandomize in polynomial time.
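The bit counts above are simple arithmetic. A small check with hypothetical sizes (n and k below are illustrative, not from the talk):

```python
import math

# Hypothetical sizes: an n-vertex graph and a walk of length k.
n = 2**20
k = 64

indep_bits = k * math.log2(n)                  # k independent samples from [n]
walk_bits = math.log2(n) + k * math.log2(3)    # start vertex + one of 3 neighbors per step
```

Here k·log n = 1280 bits drop to roughly log n + 1.58·k ≈ 121 bits.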
-
Matrix Chernoff Bound
Thm [Rudelson'97, Ahlswede-Winter'02, Oliveira'08, Tropp'11, …]. If X_1, …, X_n are independent mean-zero random d × d Hermitian matrices with ||X_i|| ≤ 1, then
Pr[ ||(1/n) Σ_i X_i|| ≥ ε ] ≤ 2d exp(−nε²/4)
Very generic bound (no independence assumptions on the entries). Many applications + martingale extensions (see Tropp).
Factor d is tight because of the diagonal case.
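In the diagonal case the norm of the average is a maximum of d independent scalar averages, so the tail probability grows with d roughly like a union bound. A quick simulation of this effect (illustrative parameters, Rademacher diagonal entries, standard library only):

```python
import random

def tail_diag(d, n, eps, trials, seed=1):
    """Pr[ ||(1/n) sum_i X_i|| >= eps ] where X_i is a diagonal matrix
    of d independent Rademacher signs; the norm is a max of d averages."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        dev = max(abs(sum(rng.choice((-1, 1)) for _ in range(n))) / n
                  for _ in range(d))
        if dev >= eps:
            hits += 1
    return hits / trials

n, eps, trials = 50, 0.3, 2000
p1 = tail_diag(1, n, eps, trials)     # d = 1: scalar Chernoff
p20 = tail_diag(20, n, eps, trials)   # d = 20: tail inflated ~d-fold
```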
-
Matrix Expander Chernoff Bound?
Conj [Wigderson-Xiao'05]: Suppose G = (V, E) is a regular graph with transition matrix P which has second eigenvalue λ. Let f: V → ℂ^{d×d} be a function with ||f(v)|| ≤ 1 and Σ_v f(v) = 0. Then, if v_1, …, v_k is a stationary random walk:
Pr[ ||(1/k) Σ_i f(v_i)|| ≥ ε ] ≤ 2d exp(−c(1−λ)kε²)
Motivated by the derandomized Alon-Roichman theorem.
-
Main Theorem
Thm. Suppose G = (V, E) is a regular graph with transition matrix P which has second eigenvalue λ. Let f: V → ℂ^{d×d} be a function with ||f(v)|| ≤ 1 and Σ_v f(v) = 0. Then, if v_1, …, v_k is a stationary random walk:
Pr[ ||(1/k) Σ_i f(v_i)|| ≥ ε ] ≤ 2d exp(−c(1−λ)kε²)
Gives a black-box derandomization of any application of matrix Chernoff.
-
1. Proof of Chernoff: reduction to mgf
Thm [Hoeffding, Chernoff]. If X_1, …, X_n are independent mean-zero random variables with |X_i| ≤ 1, then
Pr[ |(1/n) Σ_i X_i| ≥ ε ] ≤ 2 exp(−nε²/4)
Proof, one step per assumption:
Pr[ Σ_i X_i ≥ εn ] ≤ e^{−tεn} E exp(t Σ_i X_i)   (Markov)
= e^{−tεn} Π_i E e^{tX_i}   (independence)
≤ e^{−tεn} (1 + t E X_i + t²)^n   (bounded: e^x ≤ 1 + x + x² for |x| ≤ 1)
≤ e^{−tεn + nt²}   (mean zero: E X_i = 0)
≤ e^{−nε²/4}   (choosing t = ε/2)
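Both elementary inequalities used here can be verified mechanically. A minimal check (standard library; n and eps are illustrative):

```python
import math

# "Bounded" step: e^x <= 1 + x + x^2 for |x| <= 1, checked on a grid.
worst = max(math.exp(x) - (1 + x + x * x)
            for x in (i / 1000 - 1 for i in range(2001)))

# "Choose t" step: t = eps/2 turns e^{-t*eps*n + n*t^2} into e^{-n*eps^2/4}.
n, eps = 100, 0.3
t = eps / 2
optimized = math.exp(-t * eps * n + n * t * t)
target = math.exp(-n * eps**2 / 4)
```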
-
2. Proof of Expander Chernoff
Goal: Show E exp(t Σ_i f(v_i)) ≤ exp(Ckt²)
Issue: E exp(t Σ_i f(v_i)) ≠ Π_i E exp(t f(v_i))
How to control the mgf without independence?
-
Step 1: Write mgf as quadratic form
E e^{t Σ_{i≤k} f(v_i)}
= Σ_{w_0,…,w_k ∈ V} Pr(v_0 = w_0) P(w_0, w_1) ⋯ P(w_{k−1}, w_k) exp(t Σ_{1≤i≤k} f(w_i))
= (1/n) Σ_{w_0,…,w_k ∈ V} P(w_0, w_1) e^{tf(w_1)} ⋯ P(w_{k−1}, w_k) e^{tf(w_k)}
= ⟨u, (EP)^k u⟩, where u = (1/√n, …, 1/√n) and E = diag(e^{tf(1)}, …, e^{tf(n)})
(Figure: a walk w_0 → w_1 → w_2 → w_3 picking up edge weights P(w_i, w_{i+1}) and vertex weights e^{tf(w_i)}.)
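This identity can be confirmed by brute force on a tiny graph. A sketch assuming numpy; the triangle, the function f, and t are illustrative choices, and ⟨u, (EP)^k u⟩ matches the mgf because the stationary walk's law is shift-invariant:

```python
import itertools
import math
import numpy as np

# Triangle graph (2-regular), mean-zero f, walk length k = 4.
n, k, t = 3, 4, 0.3
P = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
f = np.array([1.0, -0.5, -0.5])          # sum_v f(v) = 0
E = np.diag(np.exp(t * f))
u = np.ones(n) / math.sqrt(n)

# The quadratic form from the slide.
quad = u @ np.linalg.matrix_power(E @ P, k) @ u

# Brute-force mgf: sum over all walks from a uniform (stationary) start.
brute = 0.0
for walk in itertools.product(range(n), repeat=k + 1):
    p = 1.0 / n
    for a, b in zip(walk, walk[1:]):
        p *= P[a, b]
    brute += p * math.exp(t * sum(f[v] for v in walk[1:]))
```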
-
Step 2: Bound quadratic form
Goal: Show ⟨u, (EP)^k u⟩ ≤ exp(Ckt²)
Observe: ||P − J|| ≤ λ where J = transition matrix of the complete graph with self-loops
So for small λ, should have ⟨u, (EP)^k u⟩ ≈ ⟨u, (EJ)^k u⟩
Approach 1. Use perturbation theory to show ||EP|| ≤ exp(Ct²)
Approach 2. (Healy'08) Track the projection of the iterates along u.
-
Simplest case: λ = 0
EJEJEJEJu — alternately average (J) and reweight (E), starting from u.
-
Observations
EJEJEJEJu
Observe: P shrinks every vector orthogonal to u by a factor λ (for λ = 0, J annihilates it).
⟨u, Eu⟩ = (1/n) Σ_{v∈V} e^{tf(v)} ≤ 1 + t² by the mean-zero condition (and e^x ≤ 1 + x + x² for |x| ≤ 1).
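The ⟨u, Eu⟩ bound is a one-liner to test. A small check (standard library; the mean-zero f below is an illustrative choice):

```python
import math

# Mean-zero f with |f(v)| <= 1 on n = 6 vertices (illustrative values).
f = [1.0, -1.0, 0.5, -0.5, 0.0, 0.0]
n = len(f)

# Largest violation of (1/n) sum_v e^{t f(v)} <= 1 + t^2 over t in [-1, 1].
worst = max(
    sum(math.exp(t * fv) for fv in f) / n - (1 + t * t)
    for t in (i / 50 - 1 for i in range(101))
)
```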
-
A small dynamical system
v = (EP)^k u, p_∥ = ⟨v, u⟩, p_⊥ = ||Π_⊥ v||.
Each step v ↦ EPv moves mass between the two coordinates:
∥ → ∥ : e^{t²};  ∥ → ⊥ : t;  ⊥ → ∥ : λt;  ⊥ → ⊥ : e^t λ  (≤ 1 once t ≤ log 1/λ)
For λ = 0 the two transitions out of ⊥ vanish.
Any mass that leaves the u-direction gets shrunk by λ.
-
Analyzing dynamical system gives
Thm [Gillman'94]: Suppose G = (V, E) is a regular graph with transition matrix P which has second eigenvalue λ. Let f: V → ℝ be a function with |f(v)| ≤ 1 and Σ_v f(v) = 0. Then, if v_1, …, v_k is a stationary random walk:
Pr[ |(1/k) Σ_i f(v_i)| ≥ ε ] ≤ 2 exp(−c(1−λ)kε²)
-
Generalization to Matrices?
Setup: f: V → ℂ^{d×d}, random walk v_1, …, v_k.
Goal: E Tr exp(t Σ_i f(v_i)) ≤ d · exp(Ckt²), where e^A is defined as a power series.
Main Issue: exp(A + B) ≠ exp(A) exp(B) unless [A, B] = 0, so we can't express exp(sum) as an iterated product.
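The failure of exp(A + B) = exp(A) exp(B) for noncommuting matrices is easy to exhibit. A sketch assuming numpy, with the matrix exponential computed via eigendecomposition (`expm_h` is a helper defined here, not from the talk):

```python
import numpy as np

def expm_h(A):
    """exp(A) for Hermitian A via eigendecomposition (no scipy needed)."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.conj().T

A = np.array([[0.0, 1.0], [1.0, 0.0]])    # Pauli X
B = np.array([[1.0, 0.0], [0.0, -1.0]])   # Pauli Z; AB != BA
lhs = expm_h(A + B)
rhs = expm_h(A) @ expm_h(B)
max_diff = np.abs(lhs - rhs).max()
```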
-
The Golden-Thompson Inequality
Partial Workaround [Golden-Thompson'65]:
Tr exp(A + B) ≤ Tr( exp(A) exp(B) )
Sufficient for the independent case by induction. For the expander case, need this for k matrices. False!
There are Hermitian A, B, C with Tr e^{A+B+C} > 0 > Tr(e^A e^B e^C).
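The two-matrix inequality itself can be spot-checked numerically (the three-matrix counterexample is not reproduced here). A sketch assuming numpy, over random Hermitian pairs:

```python
import numpy as np

def expm_h(A):
    """exp(A) for Hermitian A via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.conj().T

rng = np.random.default_rng(0)

def rand_herm(d):
    M = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    return (M + M.conj().T) / 2

# Golden-Thompson gap Tr(e^A e^B) - Tr(e^{A+B}) should never be negative.
worst_gap = min(
    (np.trace(expm_h(A) @ expm_h(B)) - np.trace(expm_h(A + B))).real
    for A, B in ((rand_herm(3), rand_herm(3)) for _ in range(100))
)
```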
-
Key Ingredient
[Sutter-Berta-Tomamichel'16] If A_1, …, A_k are Hermitian, then
log Tr e^{A_1 + ⋯ + A_k} ≤ ∫_{−∞}^{∞} dβ(b) log Tr[ e^{(1+ib)A_1/2} ⋯ e^{(1+ib)A_k/2} · e^{(1−ib)A_k/2} ⋯ e^{(1−ib)A_1/2} ]
where β(b) is an explicit probability density on ℝ.
1. Matrix on RHS is always PSD.
2. Average-case inequality: the e^{A_j/2} are conjugated by unitaries.
3. Implies Lieb's concavity, triple-matrix, ALT, and more.
-
Proof of SBT: Lie-Trotter Formula
e^{A+B+C} = lim_{θ→0⁺} ( e^{θA} e^{θB} e^{θC} )^{1/θ}
log Tr e^{A+B+C} = lim_{θ→0⁺} (2/θ) log ||G(θ)||_{2/θ}
for G(z) ≔ e^{zA/2} e^{zB/2} e^{zC/2}.
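The Lie-Trotter limit can be seen numerically by taking θ = 1/m for a moderately large m. A sketch assuming numpy; the matrices are small illustrative Hermitian examples:

```python
import numpy as np

def expm_h(A):
    """exp(A) for Hermitian A via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.conj().T

A = np.array([[0.0, 0.3], [0.3, 0.0]])
B = np.array([[0.2, 0.0], [0.0, -0.2]])
C = np.array([[0.1, 0.1], [0.1, 0.1]])

m = 256                                        # theta = 1/m
step = expm_h(A / m) @ expm_h(B / m) @ expm_h(C / m)
trotter = np.linalg.matrix_power(step, m)      # (e^{A/m} e^{B/m} e^{C/m})^m
exact = expm_h(A + B + C)
err = np.abs(trotter - exact).max()
```

The error decays like O(1/m), governed by the commutators of A, B, C.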
-
Complex Interpolation (Stein-Hirschman)
For each θ, find an analytic F(z) on the strip 0 ≤ Re z ≤ 1 such that:
|F(θ)| = ||G(θ)||_{2/θ}
|F(it)| = 1
|F(1 + it)| ≤ ||G(1 + it)||_2
Hirschman's interpolation inequality then gives
log |F(θ)| ≤ ∫ log |F(it)| + ∫ log |F(1 + it)|   (with explicit Poisson-kernel weights)
and we take lim θ → 0.
-
Key Ingredient
Issue. SBT involves integration over an unbounded region (all of ℝ), which is bad for Taylor expansion.
-
Bounded Modification of SBT
Solution. Prove a bounded version of SBT by replacing the strip with a half-disk.
[Thm] If A_1, …, A_k are Hermitian, then
log Tr e^{A_1 + ⋯ + A_k} ≤ ∫ dβ(φ) log Tr[ e^{e^{iφ}A_1/2} ⋯ e^{e^{iφ}A_k/2} · e^{e^{−iφ}A_k/2} ⋯ e^{e^{−iφ}A_1/2} ]
where β(φ) is an explicit probability density on [−π/2, π/2].
Proof. Analytic F(z) + Poisson kernel + Riemann map.
-
Handling Two-sided Products
Issue. Two-sided rather than one-sided products:
Tr[ e^{e^{iφ}tf(v_1)/2} ⋯ e^{e^{iφ}tf(v_k)/2} · e^{e^{−iφ}tf(v_k)/2} ⋯ e^{e^{−iφ}tf(v_1)/2} ]
Solution. Encode as a one-sided product by using vec(AMB) = (A ⊗ Bᵀ) vec(M):
⟨ ( e^{e^{iφ}tf(v_1)/2} ⊗ conj(e^{e^{iφ}tf(v_1)/2}) ) ⋯ ( e^{e^{iφ}tf(v_k)/2} ⊗ conj(e^{e^{iφ}tf(v_k)/2}) ) vec(Id), vec(Id) ⟩
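The vec/Kronecker identity doing the work here is easy to verify; note that the form vec(AMB) = (A ⊗ Bᵀ) vec(M) assumes the row-major vec convention (which is what numpy's `ravel` gives). A sketch with random matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
A, M, B = (rng.standard_normal((3, 3)) for _ in range(3))

# Row-major vec (np.ravel): vec(A M B) = (A kron B^T) vec(M)
lhs = (A @ M @ B).ravel()
rhs = np.kron(A, B.T) @ M.ravel()

# Consequently Tr(A M B) = <vec(Id), (A kron B^T) vec(M)>
tr_direct = np.trace(A @ M @ B)
tr_vec = np.eye(3).ravel() @ rhs
```

With the column-major vec convention the identity instead reads vec(AMB) = (Bᵀ ⊗ A) vec(M).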
-
Finishing the Proof
Carry out a version of Healy's argument with P ⊗ I_{d²} in place of P, with
E = diag( e^{e^{iφ}tf(1)/2} ⊗ conj(e^{e^{iφ}tf(1)/2}), …, e^{e^{iφ}tf(n)/2} ⊗ conj(e^{e^{iφ}tf(n)/2}) )
and vec(Id) ⊗ u instead of u.
This leads to the additional d factor.
-
Main Theorem
Thm. Suppose G = (V, E) is a regular graph with transition matrix P which has second eigenvalue λ. Let f: V → ℂ^{d×d} be a function with ||f(v)|| ≤ 1 and Σ_v f(v) = 0. Then, if v_1, …, v_k is a stationary random walk:
Pr[ ||(1/k) Σ_i f(v_i)|| ≥ ε ] ≤ 2d exp(−c(1−λ)kε²)
-
Open Questions
Other matrix concentration inequalities (multiplicative, low-rank, moments)
Other Banach spaces (Schatten norms)
More applications of complex interpolation