Source: web.mit.edu/sea06/agenda/talks/Speicher_survey.pdf
Free Probability Theory and Random
Matrices
Roland Speicher
Queen’s University
Kingston, Canada
We are interested in the limiting eigenvalue distribution of N × N random matrices for N → ∞.
Usually, large N distributions are close to the N → ∞ limit, and
asymptotic results give good predictions for finite N .
We can consider the convergence for N → ∞ of
• the eigenvalue distribution of one ”typical” realization of the
N × N random matrix
• the averaged eigenvalue distribution over many realizations
of the N × N random matrices
Consider a (selfadjoint!) Gaussian N × N random matrix.
We have almost sure convergence (convergence of a "typical" realization) of its eigenvalue distribution towards
Wigner's semicircle.
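A numerical sketch of this convergence (assuming a GOE-type normalization, entries of variance ~1/N, so that the limiting support is [−2, 2]):

```python
# Sketch: sample a self-adjoint Gaussian random matrix, normalized so that
# its eigenvalue distribution converges to the semicircle on [-2, 2].
import numpy as np

def gaussian_selfadjoint(n, seed=0):
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n, n))
    return (g + g.T) / np.sqrt(2 * n)   # entries of variance ~ 1/N

eigs = np.linalg.eigvalsh(gaussian_selfadjoint(1000))
hist, edges = np.histogram(eigs, bins=30, range=(-3, 3), density=True)
# essentially all eigenvalues lie in [-2, 2], matching Wigner's semicircle
```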
[Figure: eigenvalue histograms of one realization for N=300, N=1000, N=3000.]
Convergence of the averaged eigenvalue distribution usually happens much faster; one gets very good agreement with the asymptotic limit already for moderate N.
[Figure: averaged eigenvalue histograms for N=5, N=20, N=50; trials=5000.]
Consider a Wishart random matrix A = XX∗, where X is an N × M random matrix with independent Gaussian entries.
Its eigenvalue distribution converges (averaged and almost surely) towards the Marchenko-Pastur distribution.
Example: M = 2N, 2000 trials
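A quick numerical sketch (assuming the normalization A = XX∗/M, under which the limiting support is [(1−√c)², (1+√c)²] with c = N/M):

```python
# Sketch: Wishart eigenvalues versus the Marchenko-Pastur support,
# assuming the normalization A = X X* / M.
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 400                        # M = 2N, i.e. c = N/M = 1/2
x = rng.standard_normal((n, m))
eigs = np.linalg.eigvalsh(x @ x.T / m)

c = n / m
lo, hi = (1 - c**0.5) ** 2, (1 + c**0.5) ** 2
# eigenvalues concentrate on [lo, hi] ~ [0.086, 2.914] for c = 1/2
```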
[Figure: averaged eigenvalue histograms for N=10 and N=50.]
We want to consider more complicated situations, built out of
simple cases (like Gaussian or Wishart) by doing operations like
• taking the sum of two matrices
• taking the product of two matrices
• taking corners of matrices
Note: If different N × N random matrices A and B are involved
then the eigenvalue distribution of non-trivial functions f(A, B)
(like A+B or AB) will of course depend on the relation between
the eigenspaces of A and of B.
However: It turns out there is a deterministic and treatable result
if
• the eigenspaces are in ”generic” position and
• if N → ∞
This is the realm of free probability theory.
Consider N × N random matrices A and C such that
• A has an asymptotic eigenvalue distribution for N → ∞ and
C has an asymptotic eigenvalue distribution for N → ∞
• A and C are independent (i.e., entries of A are independent
from entries of C)
Then eigenspaces of A and of C might still be in special relation
(e.g., both A and C could be diagonal).
However, consider now
A and B := UCU∗,
where U is Haar unitary N × N random matrix.
Then, eigenspaces of A and of B are in ”generic” position and
the asymptotic eigenvalue distribution of A+B depends only on
the asymptotic eigenvalue distribution of A and the asymptotic
eigenvalue distribution of B (which is the same as the one of C).
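A sketch of this construction: a Haar unitary obtained by the standard recipe (QR decomposition of a complex Gaussian matrix, with a phase correction so that Q is Haar-distributed), applied to two matrices with the same two-point spectrum.

```python
# Sketch: rotate C into generic position by a Haar unitary and form A + UCU*.
import numpy as np

def haar_unitary(n, rng):
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))        # phase fix: makes the distribution Haar

rng = np.random.default_rng(0)
n = 400
a = np.diag(np.repeat([-1.0, 1.0], n // 2))   # spectrum: half -1, half +1
c = np.diag(np.tile([-1.0, 1.0], n // 2))     # same spectrum, other ordering
u = haar_unitary(n, rng)
b = u @ c @ u.conj().T
eigs = np.linalg.eigvalsh(a + b)
# the limit depends only on the two spectra (here it is the arcsine
# law on [-2, 2], the free convolution of two symmetric Bernoullis)
```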
We can expect that the asymptotic eigenvalue distribution of
f(A, B) depends only on the asymptotic eigenvalue distribution
of A and the asymptotic eigenvalue distribution of B if
• A and B are independent
• one of them is unitarily invariant
(i.e., the joint distribution of the entries does not change
under unitary conjugation)
Note: Gaussian and Wishart random matrices are unitarily invariant.
Thus: the asymptotic eigenvalue distribution of
• the sum of random matrices in generic position
A + UCU∗
• the product of random matrices in generic position
AUCU∗
• corners of unitarily invariant matrices UCU∗
should only depend on the asymptotic eigenvalue distribution of
A and of C.
Example: sum of independent Gaussian and Wishart (M = 2N)
random matrices, averaged over 10000 trials
[Figure: averaged eigenvalue histograms for N=5 and N=50.]
Example: product of two independent Wishart (M = 5N) random matrices, averaged over 10000 trials
[Figure: averaged eigenvalue histograms for N=5 and N=50.]
Example: upper left corner of size N/2 × N/2 of a randomly
rotated N × N projection matrix,
with half of the eigenvalues 0 and half of the eigenvalues 1,
averaged over 10000 trials
[Figure: averaged eigenvalue histograms for N=8 and N=32.]
Problems:
• Do we have a conceptual way of understanding
these asymptotic eigenvalue distributions?
• Is there an algorithm for actually calculating these
asymptotic eigenvalue distributions?
How do we analyze the eigenvalue distributions?

The eigenvalue distribution of a matrix A corresponds to knowledge of the traces of powers tr(A^k), since

tr(A^k) = (1/N)(λ_1^k + · · · + λ_N^k).

The averaged eigenvalue distribution of a random matrix A corresponds to knowledge of the expectations of traces of powers,

E[tr(A^k)]
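For the Gaussian ensemble these averaged traces approach the semicircle moments (the Catalan numbers 1, 2, 5 in degrees 2, 4, 6); a numerical sketch with the normalization assumed earlier:

```python
# Sketch: E[tr(A^k)] for a self-adjoint Gaussian matrix versus the
# even semicircle moments 1, 2, 5 (Catalan numbers).
import numpy as np

def avg_moment(n, k, trials, seed=0):
    rng = np.random.default_rng(seed)
    acc = 0.0
    for _ in range(trials):
        g = rng.standard_normal((n, n))
        a = (g + g.T) / np.sqrt(2 * n)
        acc += np.trace(np.linalg.matrix_power(a, k)) / n
    return acc / trials

est = [avg_moment(300, k, trials=20) for k in (2, 4, 6)]
# est is close to [1, 2, 5]
```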
Stieltjes inversion formula. If one knows the asymptotic moments

α_k := lim_{N→∞} E[tr(A^k)]

of a random matrix A, then one can get its asymptotic eigenvalue distribution µ as follows:
Form the Cauchy (or Stieltjes) transform

G(z) := Σ_{k=0}^∞ α_k / z^{k+1}.

Then:

dµ(t) = −(1/π) lim_{ε→0} ℑ G(t + iε)
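A sketch for the semicircle, whose Cauchy transform is G(z) = (z − √(z−2)√(z+2))/2 (the branch with G(z) ~ 1/z at infinity); Stieltjes inversion recovers the density √(4 − t²)/(2π):

```python
# Sketch: recover the semicircle density by Stieltjes inversion of its
# Cauchy transform, evaluated just above the real axis.
import numpy as np

def cauchy_semicircle(z):
    # factor-by-factor square root picks the branch with G(z) ~ 1/z at infinity
    return (z - np.sqrt(z - 2) * np.sqrt(z + 2)) / 2

t = np.linspace(-1.9, 1.9, 39)
density = -np.imag(cauchy_semicircle(t + 1e-6j)) / np.pi
exact = np.sqrt(4 - t**2) / (2 * np.pi)
# density and exact agree to high accuracy
```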
Consider random matrices A and B in generic position.
We want to understand A + B, i.e., for all k ∈ N,

E[tr((A + B)^k)].

But

E[tr((A+B)^6)] = E[tr(A^6)] + · · · + E[tr(ABAABA)] + · · · + E[tr(B^6)],

thus we need to understand mixed moments in A and B.
Use the following notation:

ϕ(A) := lim_{N→∞} E[tr(A)].

Question: If A and B are in generic position, can we understand

ϕ(A^{n1} B^{m1} A^{n2} B^{m2} · · ·)

in terms of (ϕ(A^k))_{k∈N} and (ϕ(B^k))_{k∈N}?
Example: Consider two Gaussian random matrices A, B which are independent (and thus in generic position).
Then the asymptotic mixed moments in A and B

ϕ(A^{n1} B^{m1} A^{n2} B^{m2} · · ·)

are given by

#{ non-crossing/planar pairings of the pattern
A·A···A (n1 times) · B·B···B (m1 times) · A·A···A (n2 times) · B·B···B (m2 times) · · ·,
which do not pair A with B }
Example: ϕ(AABBABBA) = 2, since there are exactly two such non-crossing pairings of the word A A B B A B B A.

[Diagrams of the two pairings omitted.]

Note: each of the pairings connects at least one of the groups A^{n1}, B^{m1}, A^{n2}, . . . only among itself, and thus:

ϕ((A^2 − ϕ(A^2)1)(B^2 − ϕ(B^2)1)(A − ϕ(A)1)(B^2 − ϕ(B^2)1)(A − ϕ(A)1)) = 0
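The count ϕ(AABBABBA) = 2 can be checked by brute force; a small sketch that enumerates all pairings of the eight letters and keeps the non-crossing ones that never pair A with B:

```python
# Sketch: brute-force count of non-crossing pairings respecting the letters.
from itertools import combinations

def all_pairings(points):
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i in range(len(rest)):
        for tail in all_pairings(rest[:i] + rest[i + 1:]):
            yield [(first, rest[i])] + tail

def crossing(p, q):
    (a, b), (c, d) = sorted(p), sorted(q)
    return a < c < b < d or c < a < d < b

def count_noncrossing(word):
    count = 0
    for pairing in all_pairings(list(range(len(word)))):
        if any(word[i] != word[j] for i, j in pairing):
            continue                      # pairs A with B: discard
        if any(crossing(p, q) for p, q in combinations(pairing, 2)):
            continue                      # crossing: discard
        count += 1
    return count

print(count_noncrossing("AABBABBA"))      # -> 2
```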
In general we have

ϕ((A^{n1} − ϕ(A^{n1})·1) · (B^{m1} − ϕ(B^{m1})·1) · (A^{n2} − ϕ(A^{n2})·1) · · ·)
= #{ non-crossing pairings which do not pair A with B, and for which each group is connected with some other group }
= 0
The actual equation for the calculation of the mixed moments

ϕ(A^{n1} B^{m1} A^{n2} B^{m2} · · ·)

is different for different random matrix ensembles.
However, the relation between the mixed moments,

ϕ((A^{n1} − ϕ(A^{n1})·1) · (B^{m1} − ϕ(B^{m1})·1) · · ·) = 0,

remains the same for matrix ensembles in generic position and constitutes the definition of freeness.
Definition [Voiculescu 1985]: A and B are free (with respect to ϕ) if we have for all n1, m1, n2, · · · ≥ 1 that

ϕ((A^{n1} − ϕ(A^{n1})·1) · (B^{m1} − ϕ(B^{m1})·1) · (A^{n2} − ϕ(A^{n2})·1) · · ·) = 0

ϕ((B^{n1} − ϕ(B^{n1})·1) · (A^{m1} − ϕ(A^{m1})·1) · (B^{n2} − ϕ(B^{n2})·1) · · ·) = 0

i.e.,

ϕ(alternating product in centered words in A and in B) = 0
Note: freeness is a rule for calculating mixed moments in A and
B from the moments of A and the moments of B.
Example:

ϕ((A^n − ϕ(A^n)1)(B^m − ϕ(B^m)1)) = 0,

thus

ϕ(A^n B^m) − ϕ(A^n·1)ϕ(B^m) − ϕ(A^n)ϕ(1·B^m) + ϕ(A^n)ϕ(B^m)ϕ(1·1) = 0,

and hence

ϕ(A^n B^m) = ϕ(A^n) · ϕ(B^m).
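This factorization is already visible at finite N; a sketch checking ϕ(A²B²) ≈ ϕ(A²)ϕ(B²) for an independent Gaussian/Wishart pair (normalizations assumed as in the earlier simulations):

```python
# Sketch: check phi(A^2 B^2) ~ phi(A^2) phi(B^2) for independent Gaussian
# and Wishart matrices, which are asymptotically in generic position.
import numpy as np

rng = np.random.default_rng(0)
n = 500
g = rng.standard_normal((n, n))
a = (g + g.T) / np.sqrt(2 * n)            # self-adjoint Gaussian
x = rng.standard_normal((n, 2 * n))
b = x @ x.T / (2 * n)                     # Wishart, M = 2N

def phi(mat):
    return np.trace(mat) / n              # normalized trace

lhs = phi(a @ a @ b @ b)
rhs = phi(a @ a) * phi(b @ b)
# lhs and rhs agree up to finite-N fluctuations
```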
Freeness is a rule for calculating mixed moments, analogous to the concept of independence for random variables. Thus freeness is also called free independence.
Note: free independence is a different rule from classical independence; free independence occurs typically for non-commuting random variables.
Example:

ϕ((A − ϕ(A)1) · (B − ϕ(B)1) · (A − ϕ(A)1) · (B − ϕ(B)1)) = 0,

which results in

ϕ(ABAB) = ϕ(AA) · ϕ(B) · ϕ(B) + ϕ(A) · ϕ(A) · ϕ(BB) − ϕ(A) · ϕ(B) · ϕ(A) · ϕ(B)
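A numerical sketch of this formula, using a shifted Gaussian matrix (so that ϕ(A) ≠ 0) and an independent Wishart matrix, with the same assumed normalizations as before:

```python
# Sketch: check phi(ABAB) = phi(AA)phi(B)^2 + phi(A)^2 phi(BB)
#                           - phi(A)^2 phi(B)^2  numerically.
import numpy as np

rng = np.random.default_rng(1)
n = 500
g = rng.standard_normal((n, n))
a = (g + g.T) / np.sqrt(2 * n) + 0.5 * np.eye(n)   # shift: phi(A) ~ 0.5
x = rng.standard_normal((n, 2 * n))
b = x @ x.T / (2 * n)                              # Wishart, M = 2N

def phi(mat):
    return np.trace(mat) / n

lhs = phi(a @ b @ a @ b)
rhs = (phi(a @ a) * phi(b) ** 2 + phi(a) ** 2 * phi(b @ b)
       - (phi(a) * phi(b)) ** 2)
# lhs and rhs agree up to finite-N fluctuations
```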
Consider A, B free.
Then, by freeness, the moments of A+B are uniquely determined
by the moments of A and the moments of B.
Notation: We say the distribution of A + B is the
free convolution
of the distribution of A and the distribution of B,
µA+B = µA ⊞ µB.
In principle, freeness determines this, but the concrete nature of
this rule is not clear.
Examples: We have

ϕ((A + B)^1) = ϕ(A) + ϕ(B)

ϕ((A + B)^2) = ϕ(A^2) + 2ϕ(A)ϕ(B) + ϕ(B^2)

ϕ((A + B)^3) = ϕ(A^3) + 3ϕ(A^2)ϕ(B) + 3ϕ(A)ϕ(B^2) + ϕ(B^3)

ϕ((A + B)^4) = ϕ(A^4) + 4ϕ(A^3)ϕ(B) + 4ϕ(A^2)ϕ(B^2)
    + 2(ϕ(A^2)ϕ(B)ϕ(B) + ϕ(A)ϕ(A)ϕ(B^2) − ϕ(A)ϕ(B)ϕ(A)ϕ(B))
    + 4ϕ(A)ϕ(B^3) + ϕ(B^4)
To treat these formulas in general, linearize the free convolution by going over from the moments (ϕ(A^m))_{m∈N} to free cumulants (κ_m)_{m∈N}.
Those are defined by relations like:

ϕ(A^1) = κ_1
ϕ(A^2) = κ_2 + κ_1^2
ϕ(A^3) = κ_3 + 3κ_1κ_2 + κ_1^3
ϕ(A^4) = κ_4 + 4κ_1κ_3 + 2κ_2^2 + 6κ_1^2κ_2 + κ_1^4
...
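These relations can be inverted recursively; a sketch using one standard form of the non-crossing moment-cumulant recursion, m_k = Σ_{s=1}^{k} κ_s Σ_{i_1+···+i_s = k−s} m_{i_1}···m_{i_s} (the helper `free_cumulants` is a name chosen here, not from the talk):

```python
# Sketch: free cumulants from moments via the non-crossing recursion
# m_k = sum_{s=1}^{k} kappa_s * sum_{i_1+...+i_s = k-s} m_{i_1}...m_{i_s}.
from itertools import product

def free_cumulants(m):
    """m[k] = phi(A^k), with m[0] = 1; returns [kappa_1, ..., kappa_n]."""
    n = len(m) - 1
    kappa = [0.0] * (n + 1)
    for k in range(1, n + 1):
        lower = 0.0
        for s in range(1, k):                      # the s = k term is kappa_k
            for comp in product(range(k - s + 1), repeat=s):
                if sum(comp) == k - s:
                    term = kappa[s]
                    for i in comp:
                        term *= m[i]
                    lower += term
        kappa[k] = m[k] - lower
    return kappa[1:]

print(free_cumulants([1, 0, 1, 0, 2]))   # semicircle -> [0.0, 1.0, 0.0, 0.0]
print(free_cumulants([1, 1, 2, 5, 14]))  # Marchenko-Pastur (c=1) -> all 1.0
```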
There is a combinatorial structure behind these formulas: the sums run over non-crossing partitions, with one cumulant factor κ_|V| for each block V of the partition.

ϕ(A^1) = κ_1
ϕ(A^2) = κ_2 + κ_1κ_1
ϕ(A^3) = κ_3 + κ_1κ_2 + κ_2κ_1 + κ_2κ_1 + κ_1κ_1κ_1
ϕ(A^4) = κ_4 + 4κ_1κ_3 + 2κ_2^2 + 6κ_1^2κ_2 + κ_1^4

[Diagrams of the corresponding non-crossing partitions omitted.]
This combinatorial relation between the moments (ϕ(A^m))_{m∈N} and the cumulants (κ_m)_{m∈N} can be translated into generating power series.
Put

G(z) = 1/z + Σ_{m=1}^∞ ϕ(A^m)/z^{m+1}    (Cauchy transform)

and

R(z) = Σ_{m=1}^∞ κ_m z^{m−1}    (R-transform).

Then we have the relation

1/G(z) + R(G(z)) = z.
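For the semicircle this relation can be checked directly: R(z) = z (κ_2 = 1 is the only nonvanishing cumulant), and G(z) = (z − √(z−2)√(z+2))/2 is the branch of the Cauchy transform with G(z) ~ 1/z at infinity.

```python
# Sketch: verify 1/G(z) + R(G(z)) = z for the semicircle at a few test points.
import numpy as np

def G(z):
    return (z - np.sqrt(z - 2) * np.sqrt(z + 2)) / 2   # Cauchy transform

def R(z):
    return z                                           # R-transform (kappa_2 = 1)

for z in (1.3 + 0.7j, -0.4 + 2.1j, 3.0 + 0.1j):
    print(abs(1 / G(z) + R(G(z)) - z))                 # ~ 0
```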
Theorem [Voiculescu 1986, Speicher 1994]:
Let A and B be free. Then one has

R_{A+B}(z) = R_A(z) + R_B(z),

or equivalently

κ_m^{A+B} = κ_m^A + κ_m^B for all m.
This, together with the relation between Cauchy transform and
R-transform and with the Stieltjes inversion formula, gives an
effective algorithm for calculating free convolutions, i.e., sums
of random matrices in generic position.
A  →  G_A  →  R_A
                ↓
R_A + R_B = R_{A+B}  →  G_{A+B}  →  A + B
                ↑
B  →  G_B  →  R_B
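In cumulant form the algorithm is only a few lines; a sketch that adds the cumulants of a semicircle (κ_2 = 1) and a free Poisson / Marchenko-Pastur element (all κ_m = 1), then recovers the moments of the sum by the non-crossing moment-cumulant recursion (`moments_from_cumulants` is a name chosen here, not from the talk):

```python
# Sketch: free convolution via additivity of free cumulants.
from itertools import product

def moments_from_cumulants(kappa):
    """kappa[s] = s-th free cumulant (kappa[0] is an unused placeholder)."""
    n = len(kappa) - 1
    m = [1.0] + [0.0] * n
    for k in range(1, n + 1):
        total = 0.0
        for s in range(1, k + 1):
            for comp in product(range(k - s + 1), repeat=s):
                if sum(comp) == k - s:
                    term = kappa[s]
                    for i in comp:
                        term *= m[i]
                    total += term
        m[k] = total
    return m

kA = [0, 0, 1, 0]          # semicircle: kappa_2 = 1
kB = [0, 1, 1, 1]          # free Poisson: all kappa_m = 1
kSum = [x + y for x, y in zip(kA, kB)]
print(moments_from_cumulants(kSum))   # [1.0, 1.0, 3.0, 8.0]
```

The third moment 8 agrees with the direct formula ϕ((A+B)^3) = ϕ(A^3) + 3ϕ(A^2)ϕ(B) + 3ϕ(A)ϕ(B^2) + ϕ(B^3) = 0 + 3 + 0 + 5.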
Example: Wigner + Wishart (M = 2N), trials = 4000
[Figure: averaged eigenvalue histogram for N=100.]
One has a similar analytic description for the product.

Theorem [Voiculescu 1987, Haagerup 1997, Nica + Speicher 1997]:
Put

M_A(z) := Σ_{m=1}^∞ ϕ(A^m) z^m

and define

S_A(z) := ((1 + z)/z) · M_A^{<−1>}(z)    (S-transform of A),

where M_A^{<−1>} denotes the inverse of M_A under composition.
Then: If A and B are free, we have

S_{AB}(z) = S_A(z) · S_B(z).
Example: Wishart × Wishart (M = 5N), trials=1000
[Figure: averaged eigenvalue histogram for N=100.]
upper left corner of size N/2 × N/2 of a projection matrix,
with N/2 eigenvalues 0 and N/2 eigenvalues 1; trials=5000
[Figure: averaged eigenvalue histogram for N=64.]
• Free Calculator by Raj Rao and Alan Edelman
• A. Nica and R. Speicher: Lectures on the Combinatorics of
Free Probability.
To appear soon in the London Mathematical Society Lecture
Note Series, vol. 335, Cambridge University Press
Outlook on other talks around free probability
• Anshelevich: ”free” orthogonal and Meixner polynomials
• Burda: free random Lévy matrices
• Chatterjee: concentration of measures and free probability
• Demni: free stochastic processes
• Kargin: large deviations in free probability
• Mingo + Speicher: fluctuations of random matrices
• Rashidi Far: operator-valued free probability theory and block
matrices