ONR MURI: NexGeNetSci Optimizing Information Flow in Networks Third Year Review, October 29, 2010...
ONR MURI: NexGeNetSci
Optimizing Information Flow in Networks
Third Year Review, October 29, 2010
Babak Hassibi, Department of Electrical Engineering, California Institute of Technology
Theory → Data Analysis → Numerical Experiments → Lab Experiments → Field Exercises → Real-World Operations

• Theory: first principles; rigorous math; algorithms; proofs
• Data Analysis: correct statistics; only as good as the underlying data
• Numerical Experiments: simulation; synthetic, clean data
• Lab Experiments: stylized; controlled; clean, real-world data
• Field Exercises: semi-controlled; messy, real-world data
• Real-World Operations: unpredictable; after-action reports in lieu of data
Overview of Work Done
1. Network information theory
   • wired and wireless networks, entropic vectors, Stam vectors, groups, matroids, Ingleton inequality, Cayley's hyperdeterminant, entropy power inequality
2. Estimation and control over lossy networks
   • asymptotic analysis of random matrix recursions, universal laws for networks
3. Social network problems
   • searchability and distance-dependent Kronecker graphs
   • many-to-one matchings over social networks
4. Distributed adaptive consensus
Network Information Theory
• Network information theory studies the limits of information flow in networks. Unlike point-to-point problems (solved by Shannon in 1948), almost all network information theory problems are open.
[Figures: two canonical channels — a relay setup (transmitter, relay, receiver) with sources s1, s2, input x, and outputs y1, y2 governed by P(y1|x) and P(y2|x); and a two-transmitter channel with inputs x1, x2 and outputs y1, y2 governed by P(y1|x1,x2) and P(y2|x1,x2).]
So How are Current Networks Operated?
Almost invariably via a two-step procedure:
• use coding to make each link error free (up to the Shannon capacity)
• view information as a regular "flow" and solve a flow problem over the network (routing, etc.)
But is information a regular flow?
Is Information a Regular Flow?
[Figure: the butterfly network with source bits a and b.]
• routing delivers 1.5 bits per receiver
• coding (sending a⊕b on the bottleneck edge) delivers 2 bits per receiver
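The butterfly gain above can be sketched in a few lines; a minimal illustration (not from the slides, labels assumed), showing that one coded bit on the bottleneck lets both receivers recover both sources:

```python
# Sketch of the classic butterfly network: the bottleneck edge carries the
# XOR a ^ b instead of routing only one of a or b.

def butterfly_with_coding(a: int, b: int):
    """Each receiver gets one bit directly plus the XOR over the bottleneck."""
    x = a ^ b          # network-coded bit on the shared middle edge
    rx1 = (a, a ^ x)   # receiver 1 sees (a, x) and recovers b
    rx2 = (b ^ x, b)   # receiver 2 sees (b, x) and recovers a
    return rx1, rx2

for a in (0, 1):
    for b in (0, 1):
        rx1, rx2 = butterfly_with_coding(a, b)
        assert rx1 == (a, b) and rx2 == (a, b)
print("both receivers recover (a, b): 2 bits each vs 1.5 bits with routing")
```

With plain routing the middle edge must choose between a and b, so one receiver is always a bit short; the XOR serves both at once.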
A General Network Problem
[Figure: sources s_1, s_2, ..., s_m feeding a cloud labeled "Network", which produces outputs x_1, x_2, ..., x_m.]

Suppose each source wants to communicate with its corresponding destination at rate R_i. The sum rate is

\sum_{i=1}^{m} R_i = \lim_{T \to \infty} \sup_{p(S_i^T),\ \text{netwk ops}} \frac{1}{T} \sum_{i=1}^{m} \left( H(S_i^T) + H(X_i^T) - H(S_i^T, X_i^T) \right)

The problem with the above formulation is that it is infinite-letter, and that for each T it is a highly non-convex optimization problem (both in the input distributions and the "network operations").
Entropy Vectors
• Consider n discrete random variables x_1, ..., x_n of alphabet size N and, for each nonempty S ⊆ {1, 2, ..., n}, define the joint entropy h_S = H(X_i ; i ∈ S)
• This defines a (2^n − 1)-dimensional vector called an entropy vector
• The space of all entropic vectors is denoted by Γ*_n, and its closure can be shown to be a convex cone
A Convex Formulation of the Network Problem
Associate a random variable with each edge of the network. The sum rate capacity of the network is given by

max \sum_{i=1}^{m} \left( h(s_i) + h(x_i) - h(s_i, x_i) \right)

subject to h ∈ Γ̄*_n and
• h(s_1, s_2, ..., s_m) = h(s_1) + h(s_2) + ... + h(s_m)  (independent sources)
• h(X_in, X_out) = h(X_in)  (at each node: outputs are functions of the inputs)
• h(X_i) ≤ c_i  (at each edge: capacity constraint)
Remarks
• Network information theory is basically the problem of identifying Γ*_n, which is open for n > 3.
• The following issues need to be addressed:
  - given a vector in R^{2^n − 1}, is it entropic?
  - given an entropic vector, realize it
  - can these be done in a distributed way?
• The framework results in an explosion in the number of variables
  - is this really necessary?
Stam Vectors and Wireless Networks
• Consider n continuous random vectors x_1, ..., x_n of dimension N and, for each nonempty S ⊆ {1, 2, ..., n}, define the normalized Stam entropy h_S = (1/N) h( \sum_{i∈S} x_i )
• This defines a (2^n − 1)-dimensional vector called a Stam vector
• The space of all Stam vectors can be shown to be a compact convex set
• Main Result: Network information theory problems for wireless networks can be cast as linear optimization over this set
• For n = 2, the set is characterized by the entropy power inequality

2^{2 h(x_1 + x_2)} ≥ 2^{2 h(x_1)} + 2^{2 h(x_2)}
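As a sanity check on the n = 2 case, the entropy power inequality holds with equality for independent Gaussians; a small sketch (scalar case, variances chosen arbitrarily):

```python
import math

def h_gauss(var: float) -> float:
    """Differential entropy (bits) of a scalar Gaussian with variance var."""
    return 0.5 * math.log2(2 * math.pi * math.e * var)

v1, v2 = 1.7, 0.4                      # arbitrary example variances
# x1 + x2 is Gaussian with variance v1 + v2, and 2^{2h} = 2*pi*e*var,
# so the entropy power inequality is tight here.
lhs = 2 ** (2 * h_gauss(v1 + v2))
rhs = 2 ** (2 * h_gauss(v1)) + 2 ** (2 * h_gauss(v2))
assert abs(lhs - rhs) < 1e-9 * lhs
print("EPI with equality for Gaussians:", round(lhs, 3), "=", round(rhs, 3))
```

For non-Gaussian summands the left side strictly exceeds the right, which is what makes the inequality an outer characterization.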
Some Related Objects
Entropy vectors are related to:
• Quasi-uniform distributions
  - typical sequences
• Finite groups
  - statistical mechanics, symmetric group
  - random walks over entropy vectors
• Matroids
  - representability of matroids
  - optimal linear network codes via linear programming
• Determinantal inequalities
  - Cayley's hyperdeterminant
Entropy and Groups
• Given a finite group G and n subgroups G_1, G_2, ..., G_n, the (2^n − 1)-dimensional vector with entries

h_S = log \frac{|G|}{\left| \bigcap_{i∈S} G_i \right|},  S ⊆ {1, 2, ..., n},

is entropic
• Conversely, any entropic vector, for some collection of n random variables, corresponds to some finite group and n of its subgroups
• Abelian groups are not sufficient to characterize all entropic vectors: they satisfy a so-called Ingleton inequality, which entropic vectors can violate
  - this is why linear network coding is not sufficient: linear codes form an Abelian group
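The group construction can be checked directly on a toy example. The sketch below (not from the slides; G = Z2 × Z2 with two assumed subgroups) compares the formula h_S = log(|G| / |∩_{i∈S} G_i|) against the empirical entropy of the coset variables X_i = g·G_i for g uniform on G:

```python
import math
from itertools import product

# Toy example: G = Z2 x Z2, subgroups G1 and G2 (assumed for illustration).
G  = [(0, 0), (0, 1), (1, 0), (1, 1)]
G1 = [(0, 0), (0, 1)]
G2 = [(0, 0), (1, 0)]

def add(g, h):
    return ((g[0] + h[0]) % 2, (g[1] + h[1]) % 2)

def coset(g, H):
    return frozenset(add(g, h) for h in H)

def entropy_of(subgroups):
    """Joint entropy of the coset variables, by formula and by counting."""
    inter = set(G)
    for H in subgroups:
        inter &= set(H)
    formula = math.log2(len(G) / len(inter))        # log |G| / |intersection|
    # Empirical check: g uniform on G, X_i = the coset g*G_i.
    joints = [tuple(coset(g, H) for H in subgroups) for g in G]
    probs = [joints.count(j) / len(G) for j in set(joints)]
    empirical = -sum(p * math.log2(p) for p in probs)
    assert abs(formula - empirical) < 1e-9
    return formula

print(entropy_of([G1]), entropy_of([G2]), entropy_of([G1, G2]))  # 1.0 1.0 2.0
```

Here G1 ∩ G2 is trivial, so the joint entropy is log |G| = 2 bits while each marginal is 1 bit, i.e., the two coset variables are independent fair bits.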
Codes from Non-Abelian Groups
[Figure: a butterfly-type network whose middle edges carry the products ab and ba, and a variant carrying longer words such as ba² and aba²ba.]

If a and b are chosen from a non-Abelian group, one may be able to infer them from ab and ba. There is also a larger set of signals that one may transmit.
Where is this Coming From? Ans: Stat Mech
• Suppose we have T particles, each of which can be in one of N states with probability p_i, i = 1, 2, ..., N
• Then the typical micro-states will be those for which T_i ≈ T p_i
• The entropy is (1/T times) the log of the number of micro-states:

h = \frac{1}{T} \log \frac{T!}{T_1! \, T_2! \cdots T_N!}

• One can think of the numerator as the size of the symmetric group S_T of T elements, and the denominator as the size of a certain subgroup of S_T
Entropy and Partitions
[Figure: two partitions of a 9-element set, the first into blocks 1, 2, 3 of sizes T_1 = 3, T_2 = 4, T_3 = 2, the second into blocks 1', 2', 3' of sizes T_1' = 4, T_2' = 2, T_3' = 3.]

h_1 = log ( 9! / (3! 4! 2!) ) = log 1260 ≈ 10.3 bits
h_2 = log ( 9! / (4! 2! 3!) ) = log 1260 ≈ 10.3 bits

The intersections of the blocks have sizes T_{11'} = 3, T_{21'} = 1, T_{22'} = 2, T_{23'} = 1, T_{33'} = 2, so

h_{12} = log ( 9! / (3! 1! 2! 1! 2!) ) = log 15120 ≈ 13.9 bits
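The computation above can be reproduced mechanically; a short sketch (element labels assumed) that recovers the slide's numbers from two partitions whose block-intersection sizes match the figure:

```python
import math
from itertools import product
from functools import reduce

# Two partitions of a 9-element set, with intersection sizes 3,1,2,1,2
# matching the slide's example (the element labels are assumed).
P1 = [{1, 2, 3}, {4, 5, 6, 7}, {8, 9}]   # block sizes 3, 4, 2
P2 = [{1, 2, 3, 4}, {5, 6}, {7, 8, 9}]   # block sizes 4, 2, 3

def partition_entropy(*parts):
    """h = log2( T! / product of cell! over the common refinement ), in bits."""
    T = sum(len(b) for b in parts[0])
    cells = (reduce(set.intersection, combo) for combo in product(*parts))
    sizes = [len(c) for c in cells if c]
    return math.log2(math.factorial(T) // math.prod(map(math.factorial, sizes)))

h1, h2, h12 = partition_entropy(P1), partition_entropy(P2), partition_entropy(P1, P2)
print(round(h1, 1), round(h2, 1), round(h12, 1))   # 10.3 10.3 13.9
```

The joint entropy is just the entropy of the common refinement of the two partitions, which is what makes local partition changes a natural move for exploring the entropy region.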
Staking out the Entropy Region
• Take a set of size T and for each random variable partition it into N sets
• The entropies and joint entropies can be computed from the partitions and their intersections
• By making local changes to the partitions, we can move from one entropy vector to the next
• As T and N grow, one can stake out the entire entropic region to desired accuracy
• This idea can be used to perform random walks on entropy vectors, and thereby to run MCMC methods for entropy optimization
I = h_i + h_j + h_{kl} + h_{ijk} + h_{ijl} − h_{ij} − h_{ik} − h_{il} − h_{jk} − h_{jl}

I ≤ 0 is the Ingleton bound. The figure (omitted) shows an MCMC maximization of I for T = 100 and N = 2. The resulting value of 0.025 is much superior to the previously best known violation of 0.007.
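A minimal hill-climbing sketch of this idea, not the authors' implementation: four binary random variables are 0/1 labelings of T = 100 points, and a local label flip is kept when it does not decrease the Ingleton expression I (which satisfies I ≤ 0 for all Abelian-group, in particular linear, entropy vectors). For simplicity, the Shannon entropy of the empirical distribution replaces the finite-T multinomial count; the two coincide as T grows.

```python
import math, random

T, NVARS = 100, 4
random.seed(1)
# Each variable is a 0/1 labeling of the T points (a 2-block partition).
labels = [[random.randint(0, 1) for _ in range(T)] for _ in range(NVARS)]

def h(subset):
    """Empirical joint entropy (bits) of the variables indexed by subset."""
    counts = {}
    for t in range(T):
        key = tuple(labels[v][t] for v in subset)
        counts[key] = counts.get(key, 0) + 1
    return -sum(c / T * math.log2(c / T) for c in counts.values())

def ingleton(i, j, k, l):
    """I <= 0 for all Abelian-group (e.g. linear) entropy vectors."""
    return (h([i]) + h([j]) + h([k, l]) + h([i, j, k]) + h([i, j, l])
            - h([i, j]) - h([i, k]) - h([i, l]) - h([j, k]) - h([j, l]))

best = ingleton(0, 1, 2, 3)
for _ in range(2000):                     # local move: flip one label
    v, t = random.randrange(NVARS), random.randrange(T)
    labels[v][t] ^= 1
    cur = ingleton(0, 1, 2, 3)
    if cur >= best:
        best = cur
    else:
        labels[v][t] ^= 1                 # revert a worsening move
print(f"best Ingleton value found: {best:.3f}")
```

A true MCMC version would accept worsening moves with a Metropolis probability to escape local maxima; this greedy variant only illustrates the local-move mechanics.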
Optimizing the Information Flow in Networks
This optimization can be done in networks, provided we respect the network topology.
The sum rate can be optimized in a distributed fashion:• each output edge randomly changes its partition based
on information received by the sinks
[Figure: a network node whose incoming edges carry the group/partition pairs (G_1, P_1) and (G_2, P_2), and whose outgoing edge carries a pair (G_3, P_3) formed from them.]
The Vamos Network
• Constructed from the Vamos matroid, the smallest non-representable matroid
• Capacity unknown; known to be less than 60/11
Using the distributed MCMC method, we can find a binary solution with sum rate 5.
The Non-Pappus Network

[Figure: the non-Pappus network, with sources a, b, c and internal nodes and edges labeled 1 through 9, w, x, y, z, t, u.]

A 9-element, rank-3 non-representable matroid. Capacity of the network unknown.
The Group PGL(2,p)
• We have performed a computer search to find the smallest finite group that violates the Ingleton inequality
• It is the projective linear group PGL(2,5), with 120 elements
• The groups PGL(n,p) and GL(n,p) can be used to construct codes stronger than linear network codes
Entropy and Matroids
• A matroid is a set of objects along with a rank function that satisfies submodularity
• Entropy satisfies submodularity,

H(A ∪ B) + H(A ∩ B) ≤ H(A) + H(B),

and therefore defines a polymatroid
• However, not all matroids are entropic
• A matroid is called representable if it can be represented by a collection of vectors over some (finite) field
• All representable matroids are entropic, but not all entropic matroids are representable
• When an entropic matroid is representable, the corresponding network problem has an optimal solution which is a linear network code (over the finite field that represents the matroid)
The Fano Matroid

The Fano matroid has a representation only over GF(2); its seven points a, ..., g correspond to the seven nonzero columns

A = [ 1 0 0 0 1 1 1
      0 1 0 1 0 1 1
      0 0 1 1 1 0 1 ]
      (columns a b c d e f g)

[Figure: the Fano plane on the points a through g.]
The Non-Fano Matroid

The non-Fano matroid has a representation over every field of characteristic other than 2; the same matrix

A = [ 1 0 0 0 1 1 1
      0 1 0 1 0 1 1
      0 0 1 1 1 0 1 ]
      (columns a b c d e f g)

represents it whenever 1 + 1 ≠ 0.

[Figure: the non-Fano configuration on the points a through g.]
A Network with no Linear Solution

[Figure: a network with sources a, b, c feeding intermediate nodes d, e, f, g, h, i, j, k; the receivers want c, b, a respectively.]

• This network has no linear coding solution with capacity 7
• The linear coding capacity can be shown to be 70/11 < 7
• A nonlinear network code of capacity 7 can be found
Matroid Representations
• Unfortunately, determining whether a general matroid is representable is a classical open problem in matroid theory
• However, the question of whether a matroid is binary representable has a relatively simple answer
  - the matroid must have no U(2,4) minor, i.e., no 4-element minor in which all pairs are independent and all triples dependent; see the matrix below, which has no valid completion over GF(2):

  [ 1 0 1 ?
    0 1 1 ? ]

• Similar results hold for ternary and quaternary representability, but that is about it.
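The GF(2) obstruction can be verified by brute force; a small sketch (not from the slides) showing that no four columns in GF(2)^2 are pairwise independent, so U(2,4) has no binary representation:

```python
from itertools import product

def pair_independent(u, v):
    """Two columns of a 2x4 matrix are independent iff det != 0 over GF(2)."""
    return (u[0] * v[1] - u[1] * v[0]) % 2 == 1

# Try every choice of four columns from GF(2)^2 (4^4 = 256 candidates).
found = any(
    all(pair_independent(cols[i], cols[j])
        for i in range(4) for j in range(i + 1, 4))
    for cols in product(product((0, 1), repeat=2), repeat=4)
)
print("U(2,4) realizable over GF(2)?", found)   # False
```

The reason is simple counting: GF(2)^2 has only three nonzero vectors, so four pairwise-independent columns cannot exist; over GF(3) or larger fields they can.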
Binary Entropic Vectors

A vector in R^{2^n − 1} is the entropy vector of n linearly-related binary random variables iff:
1. it has integer entries
2. h_S ≤ |S|
3. it satisfies submodularity
4. for every i, j, k, l ∈ {1, 2, ..., n} and every S ⊆ {1, 2, ..., n} \ {i, j, k, l}, the 15-dimensional entropy vector corresponding to X_i, X_j, X_k, X_l given X_S is not that of U(2,4)
Optimal Linear Binary Network Codes

The sum rate capacity of a network over the class of linear binary network codes is given by

max \sum_{i=1}^{m} \left( h(s_i) + h(x_i) - h(s_i, x_i) \right)

subject to h ∈ Γ_n, the polymatroidal cone, and
• h(s_1, s_2, ..., s_m) = h(s_1) + h(s_2) + ... + h(s_m)  (independent sources)
• h(X_in, X_out) = h(X_in)  (at each node)
• h(X_i) ≤ c_i  (at each edge)
• for every i, j, k, l ∈ {1, 2, ..., n} and every S ⊆ {1, 2, ..., n} \ {i, j, k, l}, the 15-dimensional entropy vector corresponding to X_i, X_j, X_k, X_l given X_S is in the convex cone of the entropy region of 4 binary random variables
Comments
• We have reduced the problem of optimal linear binary network coding to linear programming
• In general, the complexity of the linear program is exponential
  - 2^n − 1 variables; n(n−1)2^{n−3} submodular inequalities; roughly n^4 2^{n−7}/3 minors to consider
• However, if we define r = the number of sources, then
  - we only have O(n^{2r}) variables
  - we only have O(n^{2r}) minors to consider
  - we need significantly fewer submodular inequalities
Estimation and Control over Lossy Networks
• There is a great deal of recent interest in estimation and control over lossy networks.
• While in many cases (especially in estimation) determining the optimal algorithms is straightforward, determining the performance (stability, mean-square-error, etc.) can be quite challenging (see, e.g., Sinopoli et al).
• The main reason is that the system performance is governed by a random matrix Riccati recursion, which is incredibly difficult to analyze.
Large System Analysis
• When the dynamical system being estimated or controlled has a large state space dimension, we have proposed a method of analysis based on large random matrix theory.
• The contention is that when the system dimension and the network are large, the performance of the system exhibits universal laws that depend only on the macroscopic properties of the system and network.
• The main tool for the analysis is the Stieltjes transform of an n × n random matrix A,

s(z) = \frac{1}{n} E \, \mathrm{trace}\,(zI - A)^{-1},

from which the marginal eigendistribution of A can be found via

p(λ) = -\frac{1}{\pi} \lim_{\epsilon \to 0^+} \mathrm{Im}\, s(λ + j\epsilon)
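A numerical sketch of this inversion (assumed example, not from the slides): for a Wigner-type matrix, estimate the empirical Stieltjes transform from the eigenvalues and read off the density, which should approach the semicircle law.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
B = rng.standard_normal((n, n))
A = (B + B.T) / np.sqrt(2 * n)   # GOE-like; spectrum is near-semicircular on [-2, 2]
eigs = np.linalg.eigvalsh(A)

def density(lam, eps=0.1):
    """p(lam) = -(1/pi) Im s(lam + j*eps), with s the empirical Stieltjes transform."""
    s = np.mean(1.0 / (lam + 1j * eps - eigs))
    return -s.imag / np.pi

print(round(density(0.0), 2))    # close to the semicircle value 1/pi ~ 0.318
```

In the large-system analysis one works the other way around: s(z) is computed analytically (or via a fixed-point equation) and the density, and hence quantities like the mean-square error, follow from the same inversion.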
Example
• Consider a MIMO linear time-invariant system whose output measurements are randomly dropped across some lossy network
Consensus
• What is Consensus?
  - Given a network where nodes have different values, update over time to converge on a single value
  - In many cases, we would like convergence to the sample average
  - Simple local averaging often works
• Motivation
  - Sensor network applications
  - Synchronizing distributed agents: agree on one value to apply to all agents
Distributed Adaptive Consensus
• How to quickly reach consensus in a network is important in many applications
• Local weighted averaging often achieves consensus across a network
  - for example, Metropolis weighting (which requires only the degrees of a node and its neighbors) works
• If global knowledge of the network topology is available, optimal weights (minimizing the consensus time) can be found using semi-definite programming
• However, the semi-definite program cannot be made distributed, since the sub-gradient of the second-largest eigenvalue requires global knowledge
• We have developed an algorithm that simultaneously updates both the local averages and the weights using only local information
  - it computes the gradient of a certain quadratic cost
• It can be shown to reach consensus faster than Metropolis weighting
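A minimal sketch of plain Metropolis-weighted consensus, the baseline the adaptive algorithm is compared against (the graph and initial values below are assumed; this is not the authors' adaptive scheme):

```python
import random

# Metropolis weights: W[i][j] = 1 / (1 + max(deg_i, deg_j)) on each edge,
# W[i][i] = 1 - sum of the off-diagonal weights. Only local degrees are needed.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # assumed example graph
n = 4
nbrs = [[] for _ in range(n)]
for i, j in edges:
    nbrs[i].append(j)
    nbrs[j].append(i)
deg = [len(nb) for nb in nbrs]

random.seed(0)
x = [random.random() for _ in range(n)]            # initial node values
avg = sum(x) / n

for _ in range(200):                               # synchronous local averaging
    x = [xi + sum((x[j] - xi) / (1 + max(deg[i], deg[j])) for j in nbrs[i])
         for i, xi in enumerate(x)]

assert all(abs(xi - avg) < 1e-6 for xi in x)       # consensus on the sample average
print("consensus value:", round(x[0], 6))
```

Because the Metropolis weight matrix is symmetric and doubly stochastic, each iteration preserves the sample average, so the common limit is exactly the quantity the sensors want.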
Social Network Problems
1. Kronecker graphs, recently introduced by Leskovec
  • while having many nice properties, these graphs are not searchable
  • we have extended these graphs to "distance-dependent" Kronecker graphs, which are searchable
2. Many-to-one matching problems over social networks
  • prove the existence of pairwise stable matchings
  • develop greedy algorithms to achieve such matchings
  • characterize the price of anarchy of such algorithms
Overview of Work Done
1. Network information theory
   • wired and wireless networks, entropic vectors, Stam vectors, groups, matroids, Ingleton inequality, Cayley's hyperdeterminant, entropy power inequality
2. Estimation and control over lossy networks
   • asymptotic analysis of random matrix recursions, universal laws for networks
3. Social network problems
   • searchability and distance-dependent Kronecker graphs
   • many-to-one matchings over social networks
4. Distributed adaptive consensus