CMA 2007
Applications of free probability and random matrix theory
Øyvind Ryan
December 2007
Some important concepts from classical probability
Random variables are functions (i.e. they commute with respect to multiplication) with a given p.d.f. (denoted f)
Expectation (denoted E ) is integration
Independence
Additive convolution (∗) and the logarithm of the Fourier transform
Multiplicative convolution
Central limit law, with special role of the Gaussian law
Poisson distribution $P_c$: the limit of $\left(\left(1-\frac{c}{n}\right)\delta(0) + \frac{c}{n}\delta(1)\right)^{*n}$ as $n \to \infty$ (see the sketch after this list).
Divisibility: for a given $a$, find i.i.d. $b_1, \ldots, b_n$ such that $f_a = f_{b_1+\cdots+b_n}$.
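As a small numerical illustration of the Poisson limit above, the following MATLAB sketch (the values c = 2 and n = 200 are arbitrary choices) builds the n-fold classical convolution of the two-point distribution by repeated conv and compares it with the Poisson(c) p.m.f.:
c = 2; n = 200;                          % arbitrary illustration values
bern = [1 - c/n, c/n];                   % p.m.f. of (1 - c/n) delta(0) + (c/n) delta(1)
pmf = 1;
for i = 1:n
    pmf = conv(pmf, bern);               % classical additive convolution (*)
end
k = 0:10;
poiss = exp(-c) * c.^k ./ factorial(k);  % Poisson(c) p.m.f.
disp([pmf(k+1)' poiss']);                % the two columns nearly agree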
Can we find a more general theory, where the random variables are matrices (or, more generally, operators), with their eigenvalue distribution (or spectrum) taking the role of the p.d.f.?
What are the analogues of the above-mentioned concepts for this theory?
What are the applications of such a theory?
Free probability
Free probability was developed as a probability theory for random variables which do not commute, like matrices.
The random variables are elements in a unital ∗-algebra (denoted $A$), typically $B(H)$ or $M_n(\mathbb{C})$.
Expectation (denoted $\phi$) is a normalized linear functional on $A$. The pair $(A, \phi)$ is called a noncommutative probability space.
For matrices, $\phi$ will be the normalized trace $\mathrm{tr}_n$, defined by
$$\mathrm{tr}_n(a) = \frac{1}{n}\sum_{i=1}^{n} a_{ii}.$$
For random matrices, we set $\phi(a) = \tau_n(a) = E(\mathrm{tr}_n(a))$.
What is the "central limit" for large matrices?
We will attempt to make a connection with classical probability through large random matrices. We would like to define random matrices as "independent" if all entries in one are independent from all entries in the other.
Assume that $X_1, \ldots, X_m$ are $n \times n$ i.i.d. complex matrices with $\tau_n(X_i) = 0$ and $\tau_n(X_i^2) = 1$. What is the limit as $m \to \infty$ of
$$\frac{X_1 + \cdots + X_m}{\sqrt{m}}\,?$$
If $X_i = \frac{1}{\sqrt{n}} Y_i$, where $Y_i$ has i.i.d. complex standard Gaussian entries, then
$$\frac{X_1 + \cdots + X_m}{\sqrt{m}} \sim X,$$
where $X = \frac{1}{\sqrt{n}} Y$ and $Y$ has i.i.d. complex standard Gaussian entries. Therefore, matrices with complex standard Gaussian entries are central limit candidates.
The full circle law
What happens when $n$ is large? The eigenvalues converge to what is called the full circle law. Here for $n = 500$.
[Figure: the eigenvalues scatter over the unit disc in the complex plane.]
plot(eig( (1/sqrt(1000)) * (randn(500,500) + j*randn(500,500)) ), 'kx')
The semicircle law
[Figure: histogram of the 1000 eigenvalues; the semicircular shape is clearly visible.]
A = (1/sqrt(2000)) * (randn(1000,1000) + j*randn(1000,1000));
A = (sqrt(2)/2)*(A+A');
hist(eig(A),40)
The Marchenko-Pastur law
What happens with the eigenvalues of $\frac{1}{N} XX^H$ when $X$ is an $n \times N$ random matrix with standard complex Gaussian entries?
The eigenvalue distribution converges to the Marchenko-Pastur law with parameter $\frac{n}{N}$, denoted $\mu_{\frac{n}{N}}$.
Let $f^{\mu_c}$ be the p.d.f. of $\mu_c$. Then
$$f^{\mu_c}(x) = \left(1 - \frac{1}{c}\right)^+ \delta(x) + \frac{\sqrt{(x-a)^+(b-x)^+}}{2\pi c x}, \qquad (1)$$
where $(z)^+ = \max(0, z)$, $a = (1-\sqrt{c})^2$ and $b = (1+\sqrt{c})^2$.
The matrices $\frac{1}{N} XX^H$ occur most frequently as sample covariance matrices: $N$ is the number of observations, and $n$ is the number of parameters in the system.
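A minimal MATLAB sketch (with arbitrary sizes n = 500, N = 1000, so c = 0.5) comparing a normalized eigenvalue histogram of $\frac{1}{N}XX^H$ with the density in (1):
n = 500; N = 1000; c = n/N;
X = (randn(n,N) + j*randn(n,N)) / sqrt(2);        % standard complex Gaussian entries
W = X*X'/N;
W = (W + W')/2;                                   % enforce exact Hermitian symmetry
[counts, centers] = hist(eig(W), 40);
binwidth = centers(2) - centers(1);
bar(centers, counts/(n*binwidth));                % normalized eigenvalue histogram
hold on
a = (1 - sqrt(c))^2; b = (1 + sqrt(c))^2;
x = linspace(a, b, 200);
plot(x, sqrt((x - a).*(b - x))./(2*pi*c*x), 'r'); % Marchenko-Pastur density, no atom since c < 1
hold off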
Four different Marchenko-Pastur laws $\mu_{\frac{n}{N}}$ are drawn.
[Figure: densities of $\mu_c$ for $c = 0.5$, $0.2$, $0.1$ and $0.05$.]
Derivation of the limiting distribution for $\frac{1}{\sqrt{N}} XX^H$
When $x$ is standard complex Gaussian, we have that
$$E\left(|x|^{2p}\right) = p!.$$
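A quick Monte Carlo check of this identity in MATLAB (a sketch; $10^6$ samples is an arbitrary choice):
M = 1e6;
x = (randn(M,1) + j*randn(M,1)) / sqrt(2);   % standard complex Gaussian, E|x|^2 = 1
for p = 1:4
    fprintf('p = %d: estimate %.3f, p! = %d\n', p, mean(abs(x).^(2*p)), factorial(p));
end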
A more general statement concerns a random matrix $\frac{1}{\sqrt{N}} XX^H$, where $X$ is an $n \times N$ random matrix with independent standard complex Gaussian entries. It is known [HT] that
$$\tau_n\left(\left(\frac{1}{\sqrt{N}} XX^H\right)^p\right) = \frac{1}{N^p n} \sum_{\pi \in S_p} N^{k(\hat{\pi})} n^{l(\hat{\pi})},$$
where $\hat{\pi}$ is a permutation in $S_{2p}$ constructed in a certain way from $\pi$, and $k(\hat{\pi}), l(\hat{\pi})$ are functions taking values in $\{0, 1, 2, \ldots\}$.
One can show that this equals
$$\tau_n\left(\left(\frac{1}{\sqrt{N}} XX^H\right)^p\right) = \sum_{\pi \in NC_{2p}} 1 + \sum_k \frac{a_k}{N^{2k}}.$$
The convergence is "almost sure", which means that we have very accurate eigenvalue prediction when the matrices are large.
Motivation for free probability
One can show that for the Gaussian random matrices we considered, the limits
$$\phi\left(A^{i_1} B^{j_1} \cdots A^{i_l} B^{j_l}\right) = \lim_{n \to \infty} \mathrm{tr}_n\left(A_n^{i_1} B_n^{j_1} \cdots A_n^{i_l} B_n^{j_l}\right)$$
exist. If we extend the linear functional $\phi$ linearly to all polynomials in $A$ and $B$, the following can be shown:
Theorem
If $P_i, Q_i$ are polynomials in $A$ and $B$ respectively, with $1 \le i \le l$, and $\phi(P_i(A)) = 0$, $\phi(Q_i(B)) = 0$ for all $i$, then
$$\phi\left(P_1(A) Q_1(B) \cdots P_l(A) Q_l(B)\right) = 0.$$
This motivates the definition of freeness, which is the analogue of independence.
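The theorem can be checked numerically. The following MATLAB sketch (two independent Hermitian Gaussian matrices of an arbitrary size n = 1000) centers $P(A) = A^2 - \mathrm{tr}_n(A^2) I$ and $Q(B) = B^2 - \mathrm{tr}_n(B^2) I$, and verifies that the alternating trace $\mathrm{tr}_n(P(A)Q(B)P(A)Q(B))$ is close to zero while the non-alternating $\mathrm{tr}_n(P(A)^2 Q(B)^2)$ is not:
n = 1000;
A = (randn(n) + j*randn(n)) / sqrt(2*n); A = (A + A')/sqrt(2);   % Hermitian Gaussian matrix
B = (randn(n) + j*randn(n)) / sqrt(2*n); B = (B + B')/sqrt(2);   % independent copy
P = A^2 - (trace(A^2)/n) * eye(n);       % phi(P(A)) = 0 by construction
Q = B^2 - (trace(B^2)/n) * eye(n);       % phi(Q(B)) = 0 by construction
alternating = real(trace(P*Q*P*Q))/n     % close to 0, as the theorem predicts
nonalternating = real(trace(P*P*Q*Q))/n  % approximately phi(P^2) phi(Q^2), not small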
Definition of freeness
Definition
A family of unital ∗-subalgebras $(A_i)_{i \in I}$ is called a free family if
$$\left.\begin{array}{l} a_j \in A_{i_j} \\ i_1 \neq i_2,\ i_2 \neq i_3,\ \ldots,\ i_{n-1} \neq i_n \\ \phi(a_1) = \phi(a_2) = \cdots = \phi(a_n) = 0 \end{array}\right\} \Rightarrow \phi(a_1 \cdots a_n) = 0. \qquad (2)$$
A family of random variables $a_i$ is called a free family if the algebras they generate form a free family.
The free central limit theorem
Theorem
If
$a_1, \ldots, a_n$ are free and self-adjoint,
$\phi(a_i) = 0$,
$\phi(a_i^2) = 1$,
$\sup_i |\phi(a_i^k)| < \infty$ for all $k$,
then the sequence $(a_1 + \cdots + a_n)/\sqrt{n}$ converges in distribution to the semicircle law.
In free probability, the semicircle law thus has the role of the Gaussian law. Its density is $\frac{1}{2\pi}\sqrt{4 - x^2}$ on $[-2, 2]$.
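As a sanity check, the histogram from the earlier semicircle-law slide can be compared with this density (a MATLAB sketch; n = 1000 as before):
n = 1000;
A = (randn(n) + j*randn(n)) / sqrt(2*n);
A = (A + A')/sqrt(2);                       % Hermitian, as on the semicircle law slide
[counts, centers] = hist(eig(A), 40);
binwidth = centers(2) - centers(1);
bar(centers, counts/(n*binwidth));          % normalized eigenvalue histogram
hold on
x = -2:0.01:2;
plot(x, sqrt(4 - x.^2)/(2*pi), 'r');        % semicircle density
hold off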
Similarities between classical and free probability
1. Additive convolution ⊞: the p.d.f. of the sum of free random variables. The role of the logarithm of the Fourier transform is now taken by the R-transform, which satisfies $R_{\mu_a \boxplus \mu_b}(z) = R_{\mu_a}(z) + R_{\mu_b}(z)$ (see the worked example after this list).
2. The S-transform: a transform on probability distributions which satisfies $S_{\mu_a \boxtimes \mu_b}(z) = S_{\mu_a}(z) S_{\mu_b}(z)$.
3. Poisson distributions have their analogue in the free Poisson distributions: these are the Marchenko-Pastur laws $\mu_c$ with parameter $c$, which can also be written as the limit of $\left(\left(1 - \frac{c}{n}\right)\delta(0) + \frac{c}{n}\delta(1)\right)^{\boxplus n}$ as $n \to \infty$.
4. Infinite divisibility: there is an analogue of the Lévy-Hinčin formula for infinite divisibility in classical probability.
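A small worked example of item 1 (using the standard convention in which the semicircle law of variance $\sigma^2$ has R-transform $R(z) = \sigma^2 z$): if $a$ and $b$ are free semicircular elements with variances $\sigma_a^2$ and $\sigma_b^2$, then
$$R_{\mu_a \boxplus \mu_b}(z) = \sigma_a^2 z + \sigma_b^2 z = (\sigma_a^2 + \sigma_b^2) z,$$
so $\mu_a \boxplus \mu_b$ is again a semicircle law, with variance $\sigma_a^2 + \sigma_b^2$. The semicircular family is thus stable under ⊞, just as the Gaussian family is under ∗.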
Main usage of free probability in my papers
Let $A$ and $B$ be random matrices. How can we make a good prediction of the eigenvalue distribution of $A$ when one has the eigenvalue distributions of $A + B$ and $B$? The simplest case is when one assumes that $B$ is Gaussian (noise). What about the eigenvectors?
Assume that we have the eigenvalue distribution of $\frac{1}{N}(R + X)(R + X)^H$, where $R$ and $X$ are $n \times N$ random matrices, with $X$ Gaussian. If the columns of $R$ are realizations of some random vector $r$, what is the covariance matrix $E(r_i r_j^*)$?
We make use of multiplicative free convolution with the Marchenko-Pastur law. This has an efficient implementation.
Channel capacity estimation
The following is a widely used observation model in MIMO systems:
$$H_i = \frac{1}{\sqrt{n}}\left(H + \sigma X_i\right) \qquad (3)$$
where
$n$ is the number of receiving and transmitting antennas,
$H_i$ is the $n \times n$ measured MIMO matrix,
$H$ is the $n \times n$ MIMO channel, and
$X_i$ is the $n \times n$ noise matrix with i.i.d. zero-mean, unit-variance Gaussian entries.
Channel capacity estimation
With free probability we can estimate the eigenvalues of $\frac{1}{n} HH^H$ based on few observations $H_i$. This helps us estimate the channel capacity: the capacity of a channel with channel matrix $H$ and signal to noise ratio $\rho = \frac{1}{\sigma^2}$ is given by
$$C = \frac{1}{n} \log \det\left(I + \frac{1}{n\sigma^2} HH^H\right) \qquad (4)$$
$$\phantom{C} = \frac{1}{n} \sum_{l=1}^{n} \log\left(1 + \frac{\lambda_l}{\sigma^2}\right), \qquad (5)$$
where $\lambda_l$ are the eigenvalues of $\frac{1}{n} HH^H$.
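A minimal MATLAB sketch of (5), for a hypothetical full-rank Gaussian channel matrix H (n = 10 and σ² = 0.1 are illustration values):
n = 10; sigma2 = 0.1;
H = (randn(n) + j*randn(n)) / sqrt(2);    % hypothetical full-rank channel matrix
lambda = real(eig(H*H'/n));               % eigenvalues of (1/n) H H^H
C = mean(log(1 + lambda/sigma2))          % capacity (5), natural logarithm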
Observation model
Form the compound observation matrix
$$H_{1\ldots L} = \frac{1}{\sqrt{L}}\left[H_1, H_2, \ldots, H_L\right]$$
from the observations
$$H_i = \frac{1}{\sqrt{n}}\left(H + \sigma X_i\right). \qquad (6)$$
Using free probability, one can with high accuracy estimate the eigenvalues of $\frac{1}{n} HH^H$ from the eigenvalues of $H_{1\ldots L} H_{1\ldots L}^H$.
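A MATLAB sketch of this construction (n = 10, L = 20 and σ = 0.3 are arbitrary choices; the free deconvolution step that recovers the spectrum of $\frac{1}{n}HH^H$ from these eigenvalues is described in the papers listed below):
n = 10; L = 20; sigma = 0.3;
H = (randn(n) + j*randn(n)) / sqrt(2);          % hypothetical n x n channel matrix
Hs = cell(1, L);
for i = 1:L
    Xi = (randn(n) + j*randn(n)) / sqrt(2);     % i.i.d. Gaussian noise matrix
    Hs{i} = (H + sigma*Xi) / sqrt(n);           % observation model (6)
end
H1L = [Hs{:}] / sqrt(L);                        % compound matrix H_{1...L}, size n x nL
observed_eig = real(eig(H1L * H1L'));           % eigenvalues used by the free deconvolution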
Free capacity estimation for channel matrices of various rank
Figure: The free probability based estimator $C_f$ for various numbers of observations. $\sigma^2 = 0.1$ and $n = 10$. The rank of $H$ was 3, 5 and 6.
Application areas
digital communications,
nuclear physics,
mathematical finance
Situations in these fields can often be modelled with random matrices. When the matrices get large, free probability theory is an invaluable tool for describing the asymptotic behaviour of many systems.
Other types of matrices which are of interest are random unitary matrices and random Vandermonde matrices.
List of papers
Free Deconvolution for Signal Processing Applications. Submitted to IEEE Trans. Inform. Theory. arxiv.org/cs.IT/0701025.
Multiplicative free Convolution and Information-Plus-Noise Type Matrices. Submitted to Ann. Appl. Probab. arxiv.org/math.PR/0702342.
Channel Capacity Estimation using Free Probability Theory. Submitted to IEEE Trans. Signal Process. arxiv.org/abs/0707.3095.
Random Vandermonde Matrices - Part I: Fundamental results. Work in progress.
Random Vandermonde Matrices - Part II: Applications to wireless applications. Work in progress.
Applications of free probability in finance. Estimation of the covariance matrix itself (not only its eigenvalue distribution). 2008.
References
[HT]: "Random Matrices and K-theory for Exact C ∗-algebras". U.Haagerup and S. Thorbjørnsen. citeseer.ist.psu.edu/114210.html.1998.This talk is available athttp://heim.ifi.uio.no/∼oyvindry/talks.shtml.My publications are listed athttp://heim.ifi.uio.no/∼oyvindry/publications.shtml