ONR MURI: NexGeNetSci Optimizing Information Flow in Networks Third Year Review, October 29, 2010...
ONR MURI: NexGeNetSci
Optimizing Information Flow in Networks
Third Year Review, October 29, 2010
Babak Hassibi, Department of Electrical Engineering, California Institute of Technology
Theory → Data Analysis → Numerical Experiments → Lab Experiments → Field Exercises → Real-World Operations

• Theory: first principles; rigorous math; algorithms; proofs
• Data Analysis: correct statistics; only as good as the underlying data
• Numerical Experiments: simulation; synthetic, clean data
• Lab Experiments: stylized; controlled; clean, real-world data
• Field Exercises: semi-controlled; messy, real-world data
• Real-World Operations: unpredictable; after-action reports in lieu of data
Overview of Work Done
1. Network information theory
   • wired and wireless networks, entropic vectors, Stam vectors, groups, matroids, Ingleton inequality, Cayley's hyperdeterminant, entropy power inequality
2. Estimation and control over lossy networks
   • asymptotic analysis of random matrix recursions, universal laws for networks
3. Social network problems
   • searchability and distance-dependent Kronecker graphs
   • many-to-one matchings over social networks
4. Distributed adaptive consensus
Network Information Theory
• Network information theory studies the limits of information flow in networks. Unlike point-to-point problems (solved by Shannon in 1948), almost all network information theory problems are open.
[Figures: two canonical channels — a relay setup (transmitter, relay, receiver) with sources s1, s2, input x, and outputs y1, y2 governed by P(y1|x) and P(y2|x); and a two-transmitter channel with inputs x1, x2 and outputs y1, y2 governed by P(y1|x1,x2) and P(y2|x1,x2).]
So How are Current Networks Operated?
Almost invariably via a two-step procedure:
• use coding to make each link error free (up to the Shannon capacity)
• view information as a regular "flow" and solve a flow problem over the network (routing, etc.)
But is information a regular flow?
Is Information a Regular Flow?
[Figure: the butterfly network with source bits a and b.]
• routing delivers 1.5 bits per receiver
• coding (sending a⊕b on the bottleneck edge) delivers 2 bits per receiver
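The butterfly gain above can be sketched in a few lines; a minimal illustration (not from the slides, labels assumed), showing that one coded bit on the bottleneck lets both receivers recover both sources:

```python
# Sketch of the classic butterfly network: the bottleneck edge carries the
# XOR a ^ b instead of routing only one of a or b.

def butterfly_with_coding(a: int, b: int):
    """Each receiver gets one bit directly plus the XOR over the bottleneck."""
    x = a ^ b          # network-coded bit on the shared middle edge
    rx1 = (a, a ^ x)   # receiver 1 sees (a, x) and recovers b
    rx2 = (b ^ x, b)   # receiver 2 sees (b, x) and recovers a
    return rx1, rx2

for a in (0, 1):
    for b in (0, 1):
        rx1, rx2 = butterfly_with_coding(a, b)
        assert rx1 == (a, b) and rx2 == (a, b)
print("both receivers recover (a, b): 2 bits each vs 1.5 bits with routing")
```

With plain routing the middle edge must choose between a and b, so one receiver is always a bit short; the XOR serves both at once.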
A General Network Problem
[Figure: sources s_1, s_2, ..., s_m feeding a cloud labeled "Network", which produces outputs x_1, x_2, ..., x_m.]

Suppose each source wants to communicate with its corresponding destination at rate R_i. The sum rate is

\sum_{i=1}^{m} R_i = \lim_{T \to \infty} \sup_{p(S_i^T),\ \text{netwk ops}} \frac{1}{T} \sum_{i=1}^{m} \left( H(S_i^T) + H(X_i^T) - H(S_i^T, X_i^T) \right)

The problem with the above formulation is that it is infinite-letter, and that for each T it is a highly non-convex optimization problem (both in the input distributions and the "network operations").
Entropy Vectors
• Consider n discrete random variables x_1, ..., x_n of alphabet size N and, for each nonempty S ⊆ {1, 2, ..., n}, define the joint entropy h_S = H(X_i ; i ∈ S)
• This defines a (2^n − 1)-dimensional vector called an entropy vector
• The space of all entropic vectors is denoted by Γ*_n, and its closure can be shown to be a convex cone
A Convex Formulation of the Network Problem
Associate a random variable with each edge of the network. The sum rate capacity of the network is given by

max \sum_{i=1}^{m} \left( h(s_i) + h(x_i) - h(s_i, x_i) \right)

subject to h ∈ Γ̄*_n and
• h(s_1, s_2, ..., s_m) = h(s_1) + h(s_2) + ... + h(s_m)  (independent sources)
• h(X_in, X_out) = h(X_in)  (at each node: outputs are functions of the inputs)
• h(X_i) ≤ c_i  (at each edge: capacity constraint)
Remarks
• Network information theory is basically the problem of identifying Γ*_n, which is open for n > 3.
• The following issues need to be addressed:
  - given a vector in R^{2^n − 1}, is it entropic?
  - given an entropic vector, realize it
  - can these be done in a distributed way?
• The framework results in an explosion in the number of variables
  - is this really necessary?
Stam Vectors and Wireless Networks
• Consider n continuous random vectors x_1, ..., x_n of dimension N and, for each nonempty S ⊆ {1, 2, ..., n}, define the normalized Stam entropy h_S = (1/N) h( \sum_{i∈S} x_i )
• This defines a (2^n − 1)-dimensional vector called a Stam vector
• The space of all Stam vectors can be shown to be a compact convex set
• Main Result: Network information theory problems for wireless networks can be cast as linear optimization over this set
• For n = 2, the set is characterized by the entropy power inequality

2^{2 h(x_1 + x_2)} ≥ 2^{2 h(x_1)} + 2^{2 h(x_2)}
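As a sanity check on the n = 2 case, the entropy power inequality holds with equality for independent Gaussians; a small sketch (scalar case, variances chosen arbitrarily):

```python
import math

def h_gauss(var: float) -> float:
    """Differential entropy (bits) of a scalar Gaussian with variance var."""
    return 0.5 * math.log2(2 * math.pi * math.e * var)

v1, v2 = 1.7, 0.4                      # arbitrary example variances
# x1 + x2 is Gaussian with variance v1 + v2, and 2^{2h} = 2*pi*e*var,
# so the entropy power inequality is tight here.
lhs = 2 ** (2 * h_gauss(v1 + v2))
rhs = 2 ** (2 * h_gauss(v1)) + 2 ** (2 * h_gauss(v2))
assert abs(lhs - rhs) < 1e-9 * lhs
print("EPI with equality for Gaussians:", round(lhs, 3), "=", round(rhs, 3))
```

For non-Gaussian summands the left side strictly exceeds the right, which is what makes the inequality an outer characterization.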
Some Related Objects
Entropy vectors are related to:
• Quasi-uniform distributions
  - typical sequences
• Finite groups
  - statistical mechanics, symmetric group
  - random walks over entropy vectors
• Matroids
  - representability of matroids
  - optimal linear network codes via linear programming
• Determinantal inequalities
  - Cayley's hyperdeterminant
Entropy and Groups
• Given a finite group G and n subgroups G_1, G_2, ..., G_n, the (2^n − 1)-dimensional vector with entries

h_S = log \frac{|G|}{\left| \bigcap_{i∈S} G_i \right|},  S ⊆ {1, 2, ..., n},

is entropic
• Conversely, any entropic vector, for some collection of n random variables, corresponds to some finite group and n of its subgroups
• Abelian groups are not sufficient to characterize all entropic vectors: they satisfy a so-called Ingleton inequality, which entropic vectors can violate
  - this is why linear network coding is not sufficient: linear codes form an Abelian group
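The group construction can be checked directly on a toy example. The sketch below (not from the slides; G = Z2 × Z2 with two assumed subgroups) compares the formula h_S = log(|G| / |∩_{i∈S} G_i|) against the empirical entropy of the coset variables X_i = g·G_i for g uniform on G:

```python
import math
from itertools import product

# Toy example: G = Z2 x Z2, subgroups G1 and G2 (assumed for illustration).
G  = [(0, 0), (0, 1), (1, 0), (1, 1)]
G1 = [(0, 0), (0, 1)]
G2 = [(0, 0), (1, 0)]

def add(g, h):
    return ((g[0] + h[0]) % 2, (g[1] + h[1]) % 2)

def coset(g, H):
    return frozenset(add(g, h) for h in H)

def entropy_of(subgroups):
    """Joint entropy of the coset variables, by formula and by counting."""
    inter = set(G)
    for H in subgroups:
        inter &= set(H)
    formula = math.log2(len(G) / len(inter))        # log |G| / |intersection|
    # Empirical check: g uniform on G, X_i = the coset g*G_i.
    joints = [tuple(coset(g, H) for H in subgroups) for g in G]
    probs = [joints.count(j) / len(G) for j in set(joints)]
    empirical = -sum(p * math.log2(p) for p in probs)
    assert abs(formula - empirical) < 1e-9
    return formula

print(entropy_of([G1]), entropy_of([G2]), entropy_of([G1, G2]))  # 1.0 1.0 2.0
```

Here G1 ∩ G2 is trivial, so the joint entropy is log |G| = 2 bits while each marginal is 1 bit, i.e., the two coset variables are independent fair bits.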
Codes from Non-Abelian Groups
[Figure: a butterfly-type network whose middle edges carry the products ab and ba, and a variant carrying longer words such as ba² and aba²ba.]

If a and b are chosen from a non-Abelian group, one may be able to infer them from ab and ba. There is also a larger set of signals that one may transmit.
Where is this Coming From? Ans: Stat Mech
• Suppose we have T particles, each of which can be in one of N states with probability p_i, i = 1, 2, ..., N
• Then the typical micro-states will be those for which T_i ≈ T p_i
• The entropy is (1/T times) the log of the number of micro-states:

h = \frac{1}{T} \log \frac{T!}{T_1! \, T_2! \cdots T_N!}

• One can think of the numerator as the size of the symmetric group S_T of T elements, and the denominator as the size of a certain subgroup of S_T
Entropy and Partitions
[Figure: two partitions of a 9-element set, the first into blocks 1, 2, 3 of sizes T_1 = 3, T_2 = 4, T_3 = 2, the second into blocks 1', 2', 3' of sizes T_1' = 4, T_2' = 2, T_3' = 3.]

h_1 = log ( 9! / (3! 4! 2!) ) = log 1260 ≈ 10.3 bits
h_2 = log ( 9! / (4! 2! 3!) ) = log 1260 ≈ 10.3 bits

The intersections of the blocks have sizes T_{11'} = 3, T_{21'} = 1, T_{22'} = 2, T_{23'} = 1, T_{33'} = 2, so

h_{12} = log ( 9! / (3! 1! 2! 1! 2!) ) = log 15120 ≈ 13.9 bits
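The computation above can be reproduced mechanically; a short sketch (element labels assumed) that recovers the slide's numbers from two partitions whose block-intersection sizes match the figure:

```python
import math
from itertools import product
from functools import reduce

# Two partitions of a 9-element set, with intersection sizes 3,1,2,1,2
# matching the slide's example (the element labels are assumed).
P1 = [{1, 2, 3}, {4, 5, 6, 7}, {8, 9}]   # block sizes 3, 4, 2
P2 = [{1, 2, 3, 4}, {5, 6}, {7, 8, 9}]   # block sizes 4, 2, 3

def partition_entropy(*parts):
    """h = log2( T! / product of cell! over the common refinement ), in bits."""
    T = sum(len(b) for b in parts[0])
    cells = (reduce(set.intersection, combo) for combo in product(*parts))
    sizes = [len(c) for c in cells if c]
    return math.log2(math.factorial(T) // math.prod(map(math.factorial, sizes)))

h1, h2, h12 = partition_entropy(P1), partition_entropy(P2), partition_entropy(P1, P2)
print(round(h1, 1), round(h2, 1), round(h12, 1))   # 10.3 10.3 13.9
```

The joint entropy is just the entropy of the common refinement of the two partitions, which is what makes local partition changes a natural move for exploring the entropy region.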
Staking out the Entropy Region
• Take a set of size T and for each random variable partition it into N sets
• The entropies and joint entropies can be computed from the partitions and their intersections
• By making local changes to the partitions, we can move from one entropy vector to the next
• As T and N grow, one can stake out the entire entropic region to desired accuracy
• This idea can be used to perform random walks on entropy vectors, and thereby to run MCMC methods for entropy optimization
I = h_i + h_j + h_{kl} + h_{ijk} + h_{ijl} − h_{ij} − h_{ik} − h_{il} − h_{jk} − h_{jl}

I ≤ 0 is the Ingleton bound. The figure (omitted) shows an MCMC maximization of I for T = 100 and N = 2. The resulting value of 0.025 is much superior to the previously best known violation of 0.007.
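A minimal hill-climbing sketch of this idea, not the authors' implementation: four binary random variables are 0/1 labelings of T = 100 points, and a local label flip is kept when it does not decrease the Ingleton expression I (which satisfies I ≤ 0 for all Abelian-group, in particular linear, entropy vectors). For simplicity, the Shannon entropy of the empirical distribution replaces the finite-T multinomial count; the two coincide as T grows.

```python
import math, random

T, NVARS = 100, 4
random.seed(1)
# Each variable is a 0/1 labeling of the T points (a 2-block partition).
labels = [[random.randint(0, 1) for _ in range(T)] for _ in range(NVARS)]

def h(subset):
    """Empirical joint entropy (bits) of the variables indexed by subset."""
    counts = {}
    for t in range(T):
        key = tuple(labels[v][t] for v in subset)
        counts[key] = counts.get(key, 0) + 1
    return -sum(c / T * math.log2(c / T) for c in counts.values())

def ingleton(i, j, k, l):
    """I <= 0 for all Abelian-group (e.g. linear) entropy vectors."""
    return (h([i]) + h([j]) + h([k, l]) + h([i, j, k]) + h([i, j, l])
            - h([i, j]) - h([i, k]) - h([i, l]) - h([j, k]) - h([j, l]))

best = ingleton(0, 1, 2, 3)
for _ in range(2000):                     # local move: flip one label
    v, t = random.randrange(NVARS), random.randrange(T)
    labels[v][t] ^= 1
    cur = ingleton(0, 1, 2, 3)
    if cur >= best:
        best = cur
    else:
        labels[v][t] ^= 1                 # revert a worsening move
print(f"best Ingleton value found: {best:.3f}")
```

A true MCMC version would accept worsening moves with a Metropolis probability to escape local maxima; this greedy variant only illustrates the local-move mechanics.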
Optimizing the Information Flow in Networks
This optimization can be done in networks, provided we respect the network topology.
The sum rate can be optimized in a distributed fashion:• each output edge randomly changes its partition based
on information received by the sinks
[Figure: a network node whose incoming edges carry the group/partition pairs (G_1, P_1) and (G_2, P_2), and whose outgoing edge carries a pair (G_3, P_3) formed from them.]
The Vamos Network
• Constructed from the Vamos matroid, the smallest non-representable matroid
• Capacity unknown; known to be less than 60/11
Using the distributed MCMC method, we can find a binary solution with sum rate 5.
The Non-Pappus Network

[Figure: the non-Pappus network, with sources a, b, c and internal nodes and edges labeled 1 through 9, w, x, y, z, t, u.]

A 9-element, rank-3 non-representable matroid. Capacity of the network unknown.
The Group PGL(2,p)
• We have performed a computer search to find the smallest finite group that violates the Ingleton inequality
• It is the projective linear group PGL(2,5), with 120 elements
• The groups PGL(n,p) and GL(n,p) can be used to construct codes stronger than linear network codes
Entropy and Matroids
• A matroid is a set of objects along with a rank function that satisfies submodularity
• Entropy satisfies submodularity,

H(A ∪ B) + H(A ∩ B) ≤ H(A) + H(B),

and therefore defines a polymatroid
• However, not all matroids are entropic
• A matroid is called representable if it can be represented by a collection of vectors over some (finite) field
• All representable matroids are entropic, but not all entropic matroids are representable
• When an entropic matroid is representable, the corresponding network problem has an optimal solution which is a linear network code (over the finite field that represents the matroid)
The Fano Matroid

The Fano matroid has a representation only over GF(2); its seven points a, ..., g correspond to the seven nonzero columns

A = [ 1 0 0 0 1 1 1
      0 1 0 1 0 1 1
      0 0 1 1 1 0 1 ]
      (columns a b c d e f g)

[Figure: the Fano plane on the points a through g.]
The Non-Fano Matroid

The non-Fano matroid has a representation over every field of characteristic other than 2; the same matrix

A = [ 1 0 0 0 1 1 1
      0 1 0 1 0 1 1
      0 0 1 1 1 0 1 ]
      (columns a b c d e f g)

represents it whenever 1 + 1 ≠ 0.

[Figure: the non-Fano configuration on the points a through g.]
A Network with no Linear Solution

[Figure: a network with sources a, b, c feeding intermediate nodes d, e, f, g, h, i, j, k; the receivers want c, b, a respectively.]

• This network has no linear coding solution with capacity 7
• The linear coding capacity can be shown to be 70/11 < 7
• A nonlinear network code of capacity 7 can be found
Matroid Representations
• Unfortunately, determining whether a general matroid is representable is a classical open problem in matroid theory
• However, the question of whether a matroid is binary representable has a relatively simple answer
  - the matroid must have no U(2,4) minor, i.e., no 4-element minor in which all pairs are independent and all triples dependent; see the matrix below, which has no valid completion over GF(2):

  [ 1 0 1 ?
    0 1 1 ? ]

• Similar results hold for ternary and quaternary representability, but that is about it.
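The GF(2) obstruction can be verified by brute force; a small sketch (not from the slides) showing that no four columns in GF(2)^2 are pairwise independent, so U(2,4) has no binary representation:

```python
from itertools import product

def pair_independent(u, v):
    """Two columns of a 2x4 matrix are independent iff det != 0 over GF(2)."""
    return (u[0] * v[1] - u[1] * v[0]) % 2 == 1

# Try every choice of four columns from GF(2)^2 (4^4 = 256 candidates).
found = any(
    all(pair_independent(cols[i], cols[j])
        for i in range(4) for j in range(i + 1, 4))
    for cols in product(product((0, 1), repeat=2), repeat=4)
)
print("U(2,4) realizable over GF(2)?", found)   # False
```

The reason is simple counting: GF(2)^2 has only three nonzero vectors, so four pairwise-independent columns cannot exist; over GF(3) or larger fields they can.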
Binary Entropic Vectors

A vector in R^{2^n − 1} is the entropy vector of n linearly-related binary random variables iff:
1. it has integer entries
2. h_S ≤ |S|
3. it satisfies submodularity
4. for every i, j, k, l ∈ {1, 2, ..., n} and every S ⊆ {1, 2, ..., n} \ {i, j, k, l}, the 15-dimensional entropy vector corresponding to X_i, X_j, X_k, X_l given X_S is not that of U(2,4)
Optimal Linear Binary Network Codes

The sum rate capacity of a network over the class of linear binary network codes is given by

max \sum_{i=1}^{m} \left( h(s_i) + h(x_i) - h(s_i, x_i) \right)

subject to h ∈ Γ_n, the polymatroidal cone, and
• h(s_1, s_2, ..., s_m) = h(s_1) + h(s_2) + ... + h(s_m)  (independent sources)
• h(X_in, X_out) = h(X_in)  (at each node)
• h(X_i) ≤ c_i  (at each edge)
• for every i, j, k, l ∈ {1, 2, ..., n} and every S ⊆ {1, 2, ..., n} \ {i, j, k, l}, the 15-dimensional entropy vector corresponding to X_i, X_j, X_k, X_l given X_S is in the convex cone of the entropy region of 4 binary random variables
Comments
• We have reduced the problem of optimal linear binary network coding to linear programming
• In general, the complexity of the linear program is exponential
  - 2^n − 1 variables; n(n−1)2^{n−3} submodular inequalities; roughly n^4 2^{n−7}/3 minors to consider
• However, if we define r = the number of sources, then
  - we only have O(n^{2r}) variables
  - we only have O(n^{2r}) minors to consider
  - we need significantly fewer submodular inequalities
Estimation and Control over Lossy Networks
• There is a great deal of recent interest in estimation and control over lossy networks.
• While in many cases (especially in estimation) determining the optimal algorithms is straightforward, determining the performance (stability, mean-square-error, etc.) can be quite challenging (see, e.g., Sinopoli et al).
• The main reason is that the system performance is governed by a random matrix Riccati recursion, which is incredibly difficult to analyze.
Large System Analysis
• When the dynamical system being estimated or controlled has a large state space dimension, we have proposed a method of analysis based on large random matrix theory.
• The contention is that when the system dimension and the network are large, the performance of the system exhibits universal laws that depend only on the macroscopic properties of the system and network.
• The main tool for the analysis is the Stieltjes transform of an n × n random matrix A,

s(z) = \frac{1}{n} E \, \mathrm{trace}\,(zI - A)^{-1},

from which the marginal eigendistribution of A can be found via

p(λ) = -\frac{1}{\pi} \lim_{\epsilon \to 0^+} \mathrm{Im}\, s(λ + j\epsilon)
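A numerical sketch of this inversion (assumed example, not from the slides): for a Wigner-type matrix, estimate the empirical Stieltjes transform from the eigenvalues and read off the density, which should approach the semicircle law.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
B = rng.standard_normal((n, n))
A = (B + B.T) / np.sqrt(2 * n)   # GOE-like; spectrum is near-semicircular on [-2, 2]
eigs = np.linalg.eigvalsh(A)

def density(lam, eps=0.1):
    """p(lam) = -(1/pi) Im s(lam + j*eps), with s the empirical Stieltjes transform."""
    s = np.mean(1.0 / (lam + 1j * eps - eigs))
    return -s.imag / np.pi

print(round(density(0.0), 2))    # close to the semicircle value 1/pi ~ 0.318
```

In the large-system analysis one works the other way around: s(z) is computed analytically (or via a fixed-point equation) and the density, and hence quantities like the mean-square error, follow from the same inversion.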
Example
• Consider a MIMO linear time-invariant system whose output measurements are randomly dropped across some lossy network
Consensus
• What is Consensus?
  - Given a network where nodes have different values, update over time to converge on a single value
  - In many cases, we would like convergence to the sample average
  - Simple local averaging often works
• Motivation
  - Sensor network applications
  - Synchronizing distributed agents: agree on one value to apply to all agents
Distributed Adaptive Consensus
• How to quickly reach consensus in a network is important in many applications
• Local weighted averaging often achieves consensus across a network
  - for example, Metropolis weighting (which requires only the degrees of a node and its neighbors) works
• If global knowledge of the network topology is available, optimal weights (minimizing the consensus time) can be found using semi-definite programming
• However, the semi-definite program cannot be made distributed, since the sub-gradient of the second-largest eigenvalue requires global knowledge
• We have developed an algorithm that simultaneously updates both the local averages and the weights using only local information
  - it computes the gradient of a certain quadratic cost
• It can be shown to reach consensus faster than Metropolis weighting
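A minimal sketch of plain Metropolis-weighted consensus, the baseline the adaptive algorithm is compared against (the graph and initial values below are assumed; this is not the authors' adaptive scheme):

```python
import random

# Metropolis weights: W[i][j] = 1 / (1 + max(deg_i, deg_j)) on each edge,
# W[i][i] = 1 - sum of the off-diagonal weights. Only local degrees are needed.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # assumed example graph
n = 4
nbrs = [[] for _ in range(n)]
for i, j in edges:
    nbrs[i].append(j)
    nbrs[j].append(i)
deg = [len(nb) for nb in nbrs]

random.seed(0)
x = [random.random() for _ in range(n)]            # initial node values
avg = sum(x) / n

for _ in range(200):                               # synchronous local averaging
    x = [xi + sum((x[j] - xi) / (1 + max(deg[i], deg[j])) for j in nbrs[i])
         for i, xi in enumerate(x)]

assert all(abs(xi - avg) < 1e-6 for xi in x)       # consensus on the sample average
print("consensus value:", round(x[0], 6))
```

Because the Metropolis weight matrix is symmetric and doubly stochastic, each iteration preserves the sample average, so the common limit is exactly the quantity the sensors want.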
Social Network Problems
1. Kronecker graphs, recently introduced by Leskovec
  • while having many nice properties, these graphs are not searchable
  • we have extended these graphs to "distance-dependent" Kronecker graphs, which are searchable
2. Many-to-one matching problems over social networks
  • prove the existence of pairwise stable matchings
  • develop greedy algorithms to achieve such matchings
  • characterize the price of anarchy of such algorithms
Overview of Work Done
1. Network information theory
   • wired and wireless networks, entropic vectors, Stam vectors, groups, matroids, Ingleton inequality, Cayley's hyperdeterminant, entropy power inequality
2. Estimation and control over lossy networks
   • asymptotic analysis of random matrix recursions, universal laws for networks
3. Social network problems
   • searchability and distance-dependent Kronecker graphs
   • many-to-one matchings over social networks
4. Distributed adaptive consensus