Source: web.mit.edu/sea06/agenda/talks/Speicher_survey.pdf
Free Probability Theory and Random
Matrices
Roland Speicher
Queen’s University
Kingston, Canada
We are interested in the limiting eigenvalue distribution of N × N random matrices for N → ∞.
Usually, large N distributions are close to the N → ∞ limit, and
asymptotic results give good predictions for finite N .
We can consider the convergence for N → ∞ of
• the eigenvalue distribution of one ”typical” realization of the
N × N random matrix
• the averaged eigenvalue distribution over many realizations
of the N × N random matrices
Consider a (selfadjoint!) Gaussian N × N random matrix.
We have almost sure convergence (convergence of a "typical" realization) of its eigenvalue distribution towards
Wigner's semicircle.
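A numerical sketch of this convergence (assuming a GOE-type normalization, entries of variance ~1/N, so that the limiting support is [−2, 2]):

```python
# Sketch: sample a self-adjoint Gaussian random matrix, normalized so that
# its eigenvalue distribution converges to the semicircle on [-2, 2].
import numpy as np

def gaussian_selfadjoint(n, seed=0):
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n, n))
    return (g + g.T) / np.sqrt(2 * n)   # entries of variance ~ 1/N

eigs = np.linalg.eigvalsh(gaussian_selfadjoint(1000))
hist, edges = np.histogram(eigs, bins=30, range=(-3, 3), density=True)
# essentially all eigenvalues lie in [-2, 2], matching Wigner's semicircle
```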
[Figure: eigenvalue histograms of one realization for N=300, N=1000, N=3000.]
Convergence of the averaged eigenvalue distribution usually happens much faster; one gets very good agreement with the asymptotic limit already for moderate N.
[Figure: averaged eigenvalue histograms for N=5, N=20, N=50; trials=5000.]
Consider a Wishart random matrix A = XX∗, where X is an N × M random matrix with independent Gaussian entries.
Its eigenvalue distribution converges (averaged and almost surely) towards the Marchenko-Pastur distribution.
Example: M = 2N, 2000 trials
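A quick numerical sketch (assuming the normalization A = XX∗/M, under which the limiting support is [(1−√c)², (1+√c)²] with c = N/M):

```python
# Sketch: Wishart eigenvalues versus the Marchenko-Pastur support,
# assuming the normalization A = X X* / M.
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 400                        # M = 2N, i.e. c = N/M = 1/2
x = rng.standard_normal((n, m))
eigs = np.linalg.eigvalsh(x @ x.T / m)

c = n / m
lo, hi = (1 - c**0.5) ** 2, (1 + c**0.5) ** 2
# eigenvalues concentrate on [lo, hi] ~ [0.086, 2.914] for c = 1/2
```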
[Figure: averaged eigenvalue histograms for N=10 and N=50.]
We want to consider more complicated situations, built out of
simple cases (like Gaussian or Wishart) by doing operations like
• taking the sum of two matrices
• taking the product of two matrices
• taking corners of matrices
Note: If different N × N random matrices A and B are involved
then the eigenvalue distribution of non-trivial functions f(A, B)
(like A+B or AB) will of course depend on the relation between
the eigenspaces of A and of B.
However: It turns out there is a deterministic and treatable result
if
• the eigenspaces are in ”generic” position and
• if N → ∞
This is the realm of free probability theory.
Consider N × N random matrices A and C such that
• A has an asymptotic eigenvalue distribution for N → ∞ and
C has an asymptotic eigenvalue distribution for N → ∞
• A and C are independent (i.e., entries of A are independent
from entries of C)
Then eigenspaces of A and of C might still be in special relation
(e.g., both A and C could be diagonal).
However, consider now
A and B := UCU∗,
where U is Haar unitary N × N random matrix.
Then, eigenspaces of A and of B are in ”generic” position and
the asymptotic eigenvalue distribution of A+B depends only on
the asymptotic eigenvalue distribution of A and the asymptotic
eigenvalue distribution of B (which is the same as the one of C).
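A sketch of this construction: a Haar unitary obtained by the standard recipe (QR decomposition of a complex Gaussian matrix, with a phase correction so that Q is Haar-distributed), applied to two matrices with the same two-point spectrum.

```python
# Sketch: rotate C into generic position by a Haar unitary and form A + UCU*.
import numpy as np

def haar_unitary(n, rng):
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))        # phase fix: makes the distribution Haar

rng = np.random.default_rng(0)
n = 400
a = np.diag(np.repeat([-1.0, 1.0], n // 2))   # spectrum: half -1, half +1
c = np.diag(np.tile([-1.0, 1.0], n // 2))     # same spectrum, other ordering
u = haar_unitary(n, rng)
b = u @ c @ u.conj().T
eigs = np.linalg.eigvalsh(a + b)
# the limit depends only on the two spectra (here it is the arcsine
# law on [-2, 2], the free convolution of two symmetric Bernoullis)
```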
We can expect that the asymptotic eigenvalue distribution of
f(A, B) depends only on the asymptotic eigenvalue distribution
of A and the asymptotic eigenvalue distribution of B if
• A and B are independent
• one of them is unitarily invariant
(i.e., the joint distribution of the entries does not change
under unitary conjugation)
Note: Gaussian and Wishart random matrices are unitarily invariant.
Thus: the asymptotic eigenvalue distribution of
• the sum of random matrices in generic position
A + UCU∗
• the product of random matrices in generic position
AUCU∗
• corners of unitarily invariant matrices UCU∗
should only depend on the asymptotic eigenvalue distribution of
A and of C.
Example: sum of independent Gaussian and Wishart (M = 2N)
random matrices, averaged over 10000 trials
[Figure: averaged eigenvalue histograms for N=5 and N=50.]
Example: product of two independent Wishart (M = 5N) random matrices, averaged over 10000 trials
[Figure: averaged eigenvalue histograms for N=5 and N=50.]
Example: upper left corner of size N/2 × N/2 of a randomly
rotated N × N projection matrix,
with half of the eigenvalues 0 and half of the eigenvalues 1,
averaged over 10000 trials
[Figure: averaged eigenvalue histograms for N=8 and N=32.]
Problems:
• Do we have a conceptual way of understanding
these asymptotic eigenvalue distributions?
• Is there an algorithm for actually calculating these
asymptotic eigenvalue distributions?
How do we analyze the eigenvalue distributions?

The eigenvalue distribution of a matrix A corresponds to knowledge of the traces of powers tr(A^k), since

tr(A^k) = (1/N)(λ_1^k + · · · + λ_N^k).

The averaged eigenvalue distribution of a random matrix A corresponds to knowledge of the expectations of traces of powers,

E[tr(A^k)]
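For the Gaussian ensemble these averaged traces approach the semicircle moments (the Catalan numbers 1, 2, 5 in degrees 2, 4, 6); a numerical sketch with the normalization assumed earlier:

```python
# Sketch: E[tr(A^k)] for a self-adjoint Gaussian matrix versus the
# even semicircle moments 1, 2, 5 (Catalan numbers).
import numpy as np

def avg_moment(n, k, trials, seed=0):
    rng = np.random.default_rng(seed)
    acc = 0.0
    for _ in range(trials):
        g = rng.standard_normal((n, n))
        a = (g + g.T) / np.sqrt(2 * n)
        acc += np.trace(np.linalg.matrix_power(a, k)) / n
    return acc / trials

est = [avg_moment(300, k, trials=20) for k in (2, 4, 6)]
# est is close to [1, 2, 5]
```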
Stieltjes inversion formula. If one knows the asymptotic moments

α_k := lim_{N→∞} E[tr(A^k)]

of a random matrix A, then one can get its asymptotic eigenvalue distribution µ as follows:
Form the Cauchy (or Stieltjes) transform

G(z) := Σ_{k=0}^∞ α_k / z^{k+1}.

Then:

dµ(t) = −(1/π) lim_{ε→0} ℑ G(t + iε)
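A sketch for the semicircle, whose Cauchy transform is G(z) = (z − √(z−2)√(z+2))/2 (the branch with G(z) ~ 1/z at infinity); Stieltjes inversion recovers the density √(4 − t²)/(2π):

```python
# Sketch: recover the semicircle density by Stieltjes inversion of its
# Cauchy transform, evaluated just above the real axis.
import numpy as np

def cauchy_semicircle(z):
    # factor-by-factor square root picks the branch with G(z) ~ 1/z at infinity
    return (z - np.sqrt(z - 2) * np.sqrt(z + 2)) / 2

t = np.linspace(-1.9, 1.9, 39)
density = -np.imag(cauchy_semicircle(t + 1e-6j)) / np.pi
exact = np.sqrt(4 - t**2) / (2 * np.pi)
# density and exact agree to high accuracy
```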
Consider random matrices A and B in generic position.
We want to understand A + B, i.e., for all k ∈ N,

E[tr((A + B)^k)].

But

E[tr((A+B)^6)] = E[tr(A^6)] + · · · + E[tr(ABAABA)] + · · · + E[tr(B^6)],

thus we need to understand mixed moments in A and B.
Use the following notation:

ϕ(A) := lim_{N→∞} E[tr(A)].

Question: If A and B are in generic position, can we understand

ϕ(A^{n1} B^{m1} A^{n2} B^{m2} · · ·)

in terms of (ϕ(A^k))_{k∈N} and (ϕ(B^k))_{k∈N}?
Example: Consider two Gaussian random matrices A, B which are independent (and thus in generic position).
Then the asymptotic mixed moments in A and B

ϕ(A^{n1} B^{m1} A^{n2} B^{m2} · · ·)

are given by

#{ non-crossing/planar pairings of the pattern
A·A···A (n1 times) · B·B···B (m1 times) · A·A···A (n2 times) · B·B···B (m2 times) · · ·,
which do not pair A with B }
Example: ϕ(AABBABBA) = 2, since there are exactly two such non-crossing pairings of the word A A B B A B B A.

[Diagrams of the two pairings omitted.]

Note: each of the pairings connects at least one of the groups A^{n1}, B^{m1}, A^{n2}, . . . only among itself, and thus:

ϕ((A^2 − ϕ(A^2)1)(B^2 − ϕ(B^2)1)(A − ϕ(A)1)(B^2 − ϕ(B^2)1)(A − ϕ(A)1)) = 0
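The count ϕ(AABBABBA) = 2 can be checked by brute force; a small sketch that enumerates all pairings of the eight letters and keeps the non-crossing ones that never pair A with B:

```python
# Sketch: brute-force count of non-crossing pairings respecting the letters.
from itertools import combinations

def all_pairings(points):
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i in range(len(rest)):
        for tail in all_pairings(rest[:i] + rest[i + 1:]):
            yield [(first, rest[i])] + tail

def crossing(p, q):
    (a, b), (c, d) = sorted(p), sorted(q)
    return a < c < b < d or c < a < d < b

def count_noncrossing(word):
    count = 0
    for pairing in all_pairings(list(range(len(word)))):
        if any(word[i] != word[j] for i, j in pairing):
            continue                      # pairs A with B: discard
        if any(crossing(p, q) for p, q in combinations(pairing, 2)):
            continue                      # crossing: discard
        count += 1
    return count

print(count_noncrossing("AABBABBA"))      # -> 2
```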
In general we have

ϕ((A^{n1} − ϕ(A^{n1})·1) · (B^{m1} − ϕ(B^{m1})·1) · (A^{n2} − ϕ(A^{n2})·1) · · ·)
= #{ non-crossing pairings which do not pair A with B, and for which each group is connected with some other group }
= 0
The actual equation for the calculation of the mixed moments

ϕ(A^{n1} B^{m1} A^{n2} B^{m2} · · ·)

is different for different random matrix ensembles.
However, the relation between the mixed moments,

ϕ((A^{n1} − ϕ(A^{n1})·1) · (B^{m1} − ϕ(B^{m1})·1) · · ·) = 0,

remains the same for matrix ensembles in generic position and constitutes the definition of freeness.
Definition [Voiculescu 1985]: A and B are free (with respect to ϕ) if we have for all n1, m1, n2, · · · ≥ 1 that

ϕ((A^{n1} − ϕ(A^{n1})·1) · (B^{m1} − ϕ(B^{m1})·1) · (A^{n2} − ϕ(A^{n2})·1) · · ·) = 0

ϕ((B^{n1} − ϕ(B^{n1})·1) · (A^{m1} − ϕ(A^{m1})·1) · (B^{n2} − ϕ(B^{n2})·1) · · ·) = 0

i.e.,

ϕ(alternating product in centered words in A and in B) = 0
Note: freeness is a rule for calculating mixed moments in A and
B from the moments of A and the moments of B.
Example:

ϕ((A^n − ϕ(A^n)1)(B^m − ϕ(B^m)1)) = 0,

thus

ϕ(A^n B^m) − ϕ(A^n·1)ϕ(B^m) − ϕ(A^n)ϕ(1·B^m) + ϕ(A^n)ϕ(B^m)ϕ(1·1) = 0,

and hence

ϕ(A^n B^m) = ϕ(A^n) · ϕ(B^m).
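This factorization is already visible at finite N; a sketch checking ϕ(A²B²) ≈ ϕ(A²)ϕ(B²) for an independent Gaussian/Wishart pair (normalizations assumed as in the earlier simulations):

```python
# Sketch: check phi(A^2 B^2) ~ phi(A^2) phi(B^2) for independent Gaussian
# and Wishart matrices, which are asymptotically in generic position.
import numpy as np

rng = np.random.default_rng(0)
n = 500
g = rng.standard_normal((n, n))
a = (g + g.T) / np.sqrt(2 * n)            # self-adjoint Gaussian
x = rng.standard_normal((n, 2 * n))
b = x @ x.T / (2 * n)                     # Wishart, M = 2N

def phi(mat):
    return np.trace(mat) / n              # normalized trace

lhs = phi(a @ a @ b @ b)
rhs = phi(a @ a) * phi(b @ b)
# lhs and rhs agree up to finite-N fluctuations
```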
Freeness is a rule for calculating mixed moments, analogous to the concept of independence for random variables. Thus freeness is also called free independence.
Note: free independence is a different rule from classical independence; free independence occurs typically for non-commuting random variables.
Example:

ϕ((A − ϕ(A)1) · (B − ϕ(B)1) · (A − ϕ(A)1) · (B − ϕ(B)1)) = 0,

which results in

ϕ(ABAB) = ϕ(AA) · ϕ(B) · ϕ(B) + ϕ(A) · ϕ(A) · ϕ(BB) − ϕ(A) · ϕ(B) · ϕ(A) · ϕ(B)
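A numerical sketch of this formula, using a shifted Gaussian matrix (so that ϕ(A) ≠ 0) and an independent Wishart matrix, with the same assumed normalizations as before:

```python
# Sketch: check phi(ABAB) = phi(AA)phi(B)^2 + phi(A)^2 phi(BB)
#                           - phi(A)^2 phi(B)^2  numerically.
import numpy as np

rng = np.random.default_rng(1)
n = 500
g = rng.standard_normal((n, n))
a = (g + g.T) / np.sqrt(2 * n) + 0.5 * np.eye(n)   # shift: phi(A) ~ 0.5
x = rng.standard_normal((n, 2 * n))
b = x @ x.T / (2 * n)                              # Wishart, M = 2N

def phi(mat):
    return np.trace(mat) / n

lhs = phi(a @ b @ a @ b)
rhs = (phi(a @ a) * phi(b) ** 2 + phi(a) ** 2 * phi(b @ b)
       - (phi(a) * phi(b)) ** 2)
# lhs and rhs agree up to finite-N fluctuations
```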
Consider A, B free.
Then, by freeness, the moments of A+B are uniquely determined
by the moments of A and the moments of B.
Notation: We say the distribution of A + B is the
free convolution
of the distribution of A and the distribution of B,
µA+B = µA ⊞ µB.
In principle, freeness determines this, but the concrete nature of
this rule is not clear.
Examples: We have

ϕ((A + B)^1) = ϕ(A) + ϕ(B)

ϕ((A + B)^2) = ϕ(A^2) + 2ϕ(A)ϕ(B) + ϕ(B^2)

ϕ((A + B)^3) = ϕ(A^3) + 3ϕ(A^2)ϕ(B) + 3ϕ(A)ϕ(B^2) + ϕ(B^3)

ϕ((A + B)^4) = ϕ(A^4) + 4ϕ(A^3)ϕ(B) + 4ϕ(A^2)ϕ(B^2)
    + 2(ϕ(A^2)ϕ(B)ϕ(B) + ϕ(A)ϕ(A)ϕ(B^2) − ϕ(A)ϕ(B)ϕ(A)ϕ(B))
    + 4ϕ(A)ϕ(B^3) + ϕ(B^4)
To treat these formulas in general, linearize the free convolution by going over from the moments (ϕ(A^m))_{m∈N} to free cumulants (κ_m)_{m∈N}.
Those are defined by relations like:

ϕ(A^1) = κ_1
ϕ(A^2) = κ_2 + κ_1^2
ϕ(A^3) = κ_3 + 3κ_1κ_2 + κ_1^3
ϕ(A^4) = κ_4 + 4κ_1κ_3 + 2κ_2^2 + 6κ_1^2κ_2 + κ_1^4
...
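These relations can be inverted recursively; a sketch using one standard form of the non-crossing moment-cumulant recursion, m_k = Σ_{s=1}^{k} κ_s Σ_{i_1+···+i_s = k−s} m_{i_1}···m_{i_s} (the helper `free_cumulants` is a name chosen here, not from the talk):

```python
# Sketch: free cumulants from moments via the non-crossing recursion
# m_k = sum_{s=1}^{k} kappa_s * sum_{i_1+...+i_s = k-s} m_{i_1}...m_{i_s}.
from itertools import product

def free_cumulants(m):
    """m[k] = phi(A^k), with m[0] = 1; returns [kappa_1, ..., kappa_n]."""
    n = len(m) - 1
    kappa = [0.0] * (n + 1)
    for k in range(1, n + 1):
        lower = 0.0
        for s in range(1, k):                      # the s = k term is kappa_k
            for comp in product(range(k - s + 1), repeat=s):
                if sum(comp) == k - s:
                    term = kappa[s]
                    for i in comp:
                        term *= m[i]
                    lower += term
        kappa[k] = m[k] - lower
    return kappa[1:]

print(free_cumulants([1, 0, 1, 0, 2]))   # semicircle -> [0.0, 1.0, 0.0, 0.0]
print(free_cumulants([1, 1, 2, 5, 14]))  # Marchenko-Pastur (c=1) -> all 1.0
```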
There is a combinatorial structure behind these formulas: the sums run over non-crossing partitions, with one cumulant factor κ_|V| for each block V of the partition.

ϕ(A^1) = κ_1
ϕ(A^2) = κ_2 + κ_1κ_1
ϕ(A^3) = κ_3 + κ_1κ_2 + κ_2κ_1 + κ_2κ_1 + κ_1κ_1κ_1
ϕ(A^4) = κ_4 + 4κ_1κ_3 + 2κ_2^2 + 6κ_1^2κ_2 + κ_1^4

[Diagrams of the corresponding non-crossing partitions omitted.]
This combinatorial relation between the moments (ϕ(A^m))_{m∈N} and the cumulants (κ_m)_{m∈N} can be translated into generating power series.
Put

G(z) = 1/z + Σ_{m=1}^∞ ϕ(A^m)/z^{m+1}    (Cauchy transform)

and

R(z) = Σ_{m=1}^∞ κ_m z^{m−1}    (R-transform).

Then we have the relation

1/G(z) + R(G(z)) = z.
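For the semicircle this relation can be checked directly: R(z) = z (κ_2 = 1 is the only nonvanishing cumulant), and G(z) = (z − √(z−2)√(z+2))/2 is the branch of the Cauchy transform with G(z) ~ 1/z at infinity.

```python
# Sketch: verify 1/G(z) + R(G(z)) = z for the semicircle at a few test points.
import numpy as np

def G(z):
    return (z - np.sqrt(z - 2) * np.sqrt(z + 2)) / 2   # Cauchy transform

def R(z):
    return z                                           # R-transform (kappa_2 = 1)

for z in (1.3 + 0.7j, -0.4 + 2.1j, 3.0 + 0.1j):
    print(abs(1 / G(z) + R(G(z)) - z))                 # ~ 0
```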
Theorem [Voiculescu 1986, Speicher 1994]:
Let A and B be free. Then one has

R_{A+B}(z) = R_A(z) + R_B(z),

or equivalently

κ_m^{A+B} = κ_m^A + κ_m^B for all m.
This, together with the relation between Cauchy transform and
R-transform and with the Stieltjes inversion formula, gives an
effective algorithm for calculating free convolutions, i.e., sums
of random matrices in generic position.
A  →  G_A  →  R_A
                ↓
R_A + R_B = R_{A+B}  →  G_{A+B}  →  A + B
                ↑
B  →  G_B  →  R_B
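In cumulant form the algorithm is only a few lines; a sketch that adds the cumulants of a semicircle (κ_2 = 1) and a free Poisson / Marchenko-Pastur element (all κ_m = 1), then recovers the moments of the sum by the non-crossing moment-cumulant recursion (`moments_from_cumulants` is a name chosen here, not from the talk):

```python
# Sketch: free convolution via additivity of free cumulants.
from itertools import product

def moments_from_cumulants(kappa):
    """kappa[s] = s-th free cumulant (kappa[0] is an unused placeholder)."""
    n = len(kappa) - 1
    m = [1.0] + [0.0] * n
    for k in range(1, n + 1):
        total = 0.0
        for s in range(1, k + 1):
            for comp in product(range(k - s + 1), repeat=s):
                if sum(comp) == k - s:
                    term = kappa[s]
                    for i in comp:
                        term *= m[i]
                    total += term
        m[k] = total
    return m

kA = [0, 0, 1, 0]          # semicircle: kappa_2 = 1
kB = [0, 1, 1, 1]          # free Poisson: all kappa_m = 1
kSum = [x + y for x, y in zip(kA, kB)]
print(moments_from_cumulants(kSum))   # [1.0, 1.0, 3.0, 8.0]
```

The third moment 8 agrees with the direct formula ϕ((A+B)^3) = ϕ(A^3) + 3ϕ(A^2)ϕ(B) + 3ϕ(A)ϕ(B^2) + ϕ(B^3) = 0 + 3 + 0 + 5.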
Example: Wigner + Wishart (M = 2N), trials = 4000
[Figure: averaged eigenvalue histogram for N=100.]
One has a similar analytic description for the product.

Theorem [Voiculescu 1987, Haagerup 1997, Nica + Speicher 1997]:
Put

M_A(z) := Σ_{m=1}^∞ ϕ(A^m) z^m

and define

S_A(z) := ((1 + z)/z) · M_A^{<−1>}(z)    (S-transform of A),

where M_A^{<−1>} denotes the inverse of M_A under composition.
Then: If A and B are free, we have

S_{AB}(z) = S_A(z) · S_B(z).
Example: Wishart × Wishart (M = 5N), trials=1000
[Figure: averaged eigenvalue histogram for N=100.]
upper left corner of size N/2 × N/2 of a projection matrix,
with N/2 eigenvalues 0 and N/2 eigenvalues 1; trials=5000
[Figure: averaged eigenvalue histogram for N=64.]
• Free Calculator by Raj Rao and Alan Edelman
• A. Nica and R. Speicher: Lectures on the Combinatorics of
Free Probability.
To appear soon in the London Mathematical Society Lecture
Note Series, vol. 335, Cambridge University Press
Outlook on other talks around free probability
• Anshelevich: ”free” orthogonal and Meixner polynomials
• Burda: free random Lévy matrices
• Chatterjee: concentration of measures and free probability
• Demni: free stochastic processes
• Kargin: large deviations in free probability
• Mingo + Speicher: fluctuations of random matrices
• Rashidi Far: operator-valued free probability theory and block
matrices