Introduction to information complexity
June 30, 2013
Mark Braverman, Princeton University
Part I: Information theory
• Information theory, in its modern form, was introduced in the 1940s to study the problem of transmitting data over physical channels.
[Figure: Alice sends data to Bob over a communication channel.]
Quantifying “information”
• Information is measured in bits.
• The basic notion is Shannon’s entropy.
• The entropy of a random variable is the (typical) number of bits needed to remove the uncertainty of the variable.
• For a discrete variable: $H(X) = \sum_x \Pr[X=x] \log \frac{1}{\Pr[X=x]}$ (logs base 2).
Shannon’s entropy
• Important examples and properties:
– If $X$ is a constant, then $H(X) = 0$.
– If $X$ is uniform on a finite set $S$ of possible values, then $H(X) = \log |S|$.
– If $X$ is supported on at most $s$ values, then $H(X) \le \log s$.
– If $Y$ is a random variable determined by $X$, then $H(Y) \le H(X)$.
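A minimal numeric sketch of these properties (the example distributions here are illustrative, not from the talk):

```python
import math

def entropy(dist):
    """Shannon entropy (in bits) of a distribution given as {value: probability}."""
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

# A constant has zero entropy.
assert entropy({"a": 1.0}) == 0.0

# A variable uniform on 8 values has entropy log2(8) = 3 bits.
uniform8 = {i: 1 / 8 for i in range(8)}
assert abs(entropy(uniform8) - 3.0) < 1e-9

# Any variable supported on at most 8 values has entropy at most 3 bits.
skewed = {0: 0.7, 1: 0.1, 2: 0.1, 3: 0.1}
assert entropy(skewed) <= 3.0
```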
Conditional entropy
• For two (potentially correlated) variables $X, Y$, the conditional entropy of $X$ given $Y$ is the amount of uncertainty left in $X$ given $Y$:
$H(X|Y) := E_{y \sim Y}[H(X \mid Y = y)]$.
• One can show $H(XY) = H(Y) + H(X|Y)$.
• This important fact is known as the chain rule.
• If $X$ and $Y$ are independent, then $H(XY) = H(X) + H(Y)$.
Example
• Let $X = (B_1, B_1 \oplus B_2, B_4)$ and $Y = (B_2 \oplus B_3, B_4, B_5)$, where $B_1, \dots, B_5 \in_U \{0,1\}$ are independent uniform bits.
• Then
– $H(X) = 3$; $H(Y) = 3$; $H(XY) = 5$; $H(X|Y) = 2$.
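A brute-force check of these numbers and the chain rule, under the reconstruction of $X$ and $Y$ above (which is inferred from the Venn-diagram labels on the next slide):

```python
import math
from collections import Counter
from itertools import product

def entropy(counts):
    """Entropy in bits of an empirical distribution given by a Counter."""
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# Enumerate all 2^5 equally likely settings of the independent bits B1..B5.
X, Y, XY = Counter(), Counter(), Counter()
for b1, b2, b3, b4, b5 in product([0, 1], repeat=5):
    x = (b1, b1 ^ b2, b4)
    y = (b2 ^ b3, b4, b5)
    X[x] += 1; Y[y] += 1; XY[(x, y)] += 1

hx, hy, hxy = entropy(X), entropy(Y), entropy(XY)
print(hx, hy, hxy)    # 3.0 3.0 5.0
print(hxy - hy)       # H(X|Y) = 2.0, by the chain rule
print(hx + hy - hxy)  # I(X;Y) = 1.0 (mutual information, defined next)
```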
Mutual information
[Venn diagram: two overlapping circles $H(X)$ and $H(Y)$. In the example above, the part of $H(X)$ outside $H(Y)$, i.e. $H(X|Y)$, contains $B_1$ and $B_1 \oplus B_2$; the intersection $I(X;Y)$ contains $B_4$; the part of $H(Y)$ outside $H(X)$, i.e. $H(Y|X)$, contains $B_2 \oplus B_3$ and $B_5$.]
Mutual information
• The mutual information is defined as
$I(X;Y) := H(Y) - H(Y|X) = H(X) - H(X|Y)$.
• “By how much does knowing $X$ reduce the entropy of $Y$?”
• Always non-negative: $I(X;Y) \ge 0$.
• Conditional mutual information:
$I(X;Y|Z) := H(Y|Z) - H(Y|XZ)$.
• Chain rule for mutual information:
$I(XY;Z) = I(X;Z) + I(Y;Z|X)$.
• Simple intuitive interpretation (see the sketch below).
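A quick numeric check of non-negativity and the chain rule for mutual information on a random joint distribution of three bits (helper names are mine):

```python
import math
import random
from collections import defaultdict

def H(p):
    """Entropy in bits of a dict {outcome: probability}."""
    return sum(q * math.log2(1 / q) for q in p.values() if q > 0)

def marginal(p, idx):
    """Marginal of the coordinates listed in idx from a joint dict {tuple: prob}."""
    m = defaultdict(float)
    for o, q in p.items():
        m[tuple(o[i] for i in idx)] += q
    return m

def I(p, a, b, c=()):
    """Conditional mutual information I(A;B|C) = H(AC) + H(BC) - H(ABC) - H(C)."""
    return (H(marginal(p, a + c)) + H(marginal(p, b + c))
            - H(marginal(p, a + b + c)) - H(marginal(p, c)))

# A random joint distribution over three bits (X, Y, Z) at coordinates 0, 1, 2.
outcomes = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
weights = [random.random() for _ in outcomes]
p = {o: w / sum(weights) for o, w in zip(outcomes, weights)}

assert I(p, (0,), (1,)) >= -1e-9          # mutual information is non-negative
lhs = I(p, (0, 1), (2,))                  # I(XY;Z)
rhs = I(p, (0,), (2,)) + I(p, (1,), (2,), c=(0,))  # I(X;Z) + I(Y;Z|X)
assert abs(lhs - rhs) < 1e-9              # the chain rule holds
```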
Information Theory
• The reason information theory is so important for communication is that information-theoretic quantities readily operationalize.
• Can attach operational meaning to Shannon’s entropy: “the cost of transmitting $X$”.
• Let $C(X)$ be the (expected) cost of transmitting a sample of $X$.
Is $C(X) = H(X)$?
• Not quite.
• Let $X$ be a uniformly random trit: $X \in_U \{1, 2, 3\}$.
• $H(X) = \log 3 \approx 1.585$, while the best prefix-free code

1 → 0
2 → 10
3 → 11

gives $C(X) = \frac{1 + 2 + 2}{3} = \frac{5}{3} \approx 1.67$.
• It is always the case that $C(X) \ge H(X)$.
But $C(X)$ and $H(X)$ are close
• Huffman’s coding: $C(X) \le H(X) + 1$.
• This is a compression result: “an uninformative message turned into a short one”.
• Therefore: $H(X) \le C(X) \le H(X) + 1$ (see the sketch below).
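A compact Huffman-coding sketch illustrating these bounds on the trit example (heap-based; the helper name huffman_cost is mine):

```python
import heapq
import math

def huffman_cost(probs):
    """Expected codeword length of a Huffman code for the given probabilities."""
    # Each heap entry is (probability, tie-breaker, expected cost accumulated so far).
    heap = [(p, i, 0.0) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every codeword beneath them,
        # which costs p1 + p2 in expectation.
        heapq.heappush(heap, (p1 + p2, tie, c1 + c2 + p1 + p2))
        tie += 1
    return heap[0][2]

H = math.log2(3)                    # entropy of a uniform trit, ~1.585
C = huffman_cost([1/3, 1/3, 1/3])   # 5/3 ~ 1.667
assert H <= C <= H + 1
```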
Shannon’s noiseless coding
• The cost of communicating many copies of $X$ scales as $H(X)$.
• Shannon’s source coding theorem:
– Let $C(X^n)$ be the cost of transmitting $n$ independent copies of $X$. Then the amortized transmission cost is
$\lim_{n \to \infty} C(X^n)/n = H(X)$.
• This equation gives $H(X)$ operational meaning (see the sketch below).
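A minimal sketch of this amortization for the uniform trit: encode $n$ i.i.d. trits as a single block and watch the per-copy cost fall toward $\log 3 \approx 1.585$ (the closed form for the optimal prefix code on a uniform distribution is standard; the helper name is mine):

```python
import math

def uniform_huffman_cost(m):
    """Expected codeword length of an optimal prefix code for m equally likely values."""
    k = math.ceil(math.log2(m))       # codewords have length k-1 or k
    deep = 2 * (m - 2 ** (k - 1))      # number of values forced to length k
    return (deep * k + (m - deep) * (k - 1)) / m

# Per-copy cost of encoding n i.i.d. uniform trits as one block of 3^n values.
for n in (1, 2, 4, 8, 12):
    print(n, uniform_huffman_cost(3 ** n) / n)   # 1.667, 1.611, ... -> log2(3)
print(math.log2(3))                               # ~1.585
```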
[Figure: Alice transmits $X_1, \dots, X_n, \dots$ to Bob over a communication channel at $H(X)$ bits per copy: $H(X)$ operationalized.]
$H(X)$ is nicer than $C(X)$
• $H(X)$ is additive for independent variables: $H(X_1 X_2) = H(X_1) + H(X_2)$.
• Let $T_1, T_2$ be independent trits.
• $H(T_1 T_2) = 2 \log 3$, while $C(T_1 T_2) = \frac{29}{9} < C(T_1) + C(T_2) = \frac{30}{9}$ (cf. the block-coding sketch above).
• Works well with concepts such as channel capacity.
Operationalizing other quantities
• Conditional entropy $H(X|Y)$ (cf. Slepian-Wolf Theorem):

[Figure: Alice holds $X_1, \dots, X_n, \dots$; Bob holds $Y_1, \dots, Y_n, \dots$. Transmitting the $X_i$’s to Bob over the communication channel costs $H(X|Y)$ per copy.]
Operationalizing other quantities
• Mutual information $I(X;Y)$:

[Figure: Alice holds $X_1, \dots, X_n, \dots$; sampling correlated $Y_1, \dots, Y_n, \dots$ on Bob’s side costs $I(X;Y)$ per copy over the communication channel.]
Information theory and entropy
• Allows us to formalize intuitive notions.
• Operationalized in the context of one-way transmission and related problems.
• Has nice properties (additivity, chain rule…)
• Next, we discuss extensions to more interesting communication scenarios.
Communication complexity
• Focus on the two-party randomized setting.

[Figure: players A and B hold inputs X and Y and share randomness R; together they implement a functionality $F(X,Y)$, e.g. $F(X,Y) =$ “$X = Y$?”.]
Communication complexity
• A holds $X$, B holds $Y$; they share randomness $R$.
• Goal: implement a functionality $F(X,Y)$.
• A protocol computing $F(X,Y)$ exchanges messages
$m_1(X, R)$, $m_2(Y, m_1, R)$, $m_3(X, m_1, m_2, R)$, …
• Communication cost = # of bits exchanged (see the sketch below).
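A toy rendering of this message structure; the cost accounting is just the total message length. Everything here (function names, the equality example) is illustrative:

```python
import random

def run_protocol(x, y, message_fns, rng):
    """message_fns[i] computes the i-th message from (own input, transcript, R);
    players alternate, Alice first.  Returns (transcript, #bits exchanged)."""
    transcript, cost = [], 0
    for i, fn in enumerate(message_fns):
        own_input = x if i % 2 == 0 else y      # Alice on even turns, Bob on odd
        msg = fn(own_input, transcript, rng)    # each message is a bit string
        transcript.append(msg)
        cost += len(msg)
    return transcript, cost

# Trivial protocol for F(X,Y) = "X = Y?": Alice sends X, Bob replies with 1 bit.
m1 = lambda x, t, r: x                          # m1(X, R)
m2 = lambda y, t, r: "1" if t[0] == y else "0"  # m2(Y, m1, R)
rng = random.Random(0)                          # shared randomness R (unused here)
transcript, cost = run_protocol("0110", "0110", [m1, m2], rng)
print(transcript[-1], cost)                     # '1', 5 bits
```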
Communication complexity
• Numerous applications/potential applications.
• Considerably more difficult to obtain lower bounds than for transmission (still much easier than for other models of computation!).
Communication complexity
• (Distributional) communication complexity with input distribution $\mu$ and error $\varepsilon$: $CC(F, \mu, \varepsilon)$. Error $\le \varepsilon$ w.r.t. $\mu$.
• (Randomized/worst-case) communication complexity: $CC(F, \varepsilon)$. Error $\le \varepsilon$ on all inputs.
• Yao’s minimax: $CC(F, \varepsilon) = \max_\mu CC(F, \mu, \varepsilon)$.
Examples
• Equality: $EQ(X, Y) := 1_{X = Y}$.
• $CC(EQ, \varepsilon) = O(\log \frac{1}{\varepsilon})$ (with shared randomness).
Equality
• $F$ is $EQ$.
• $\mu$ is a distribution where w.p. $\frac{1}{2}$, $X = Y$, and w.p. $\frac{1}{2}$, $(X, Y)$ are uniformly random.

[Protocol: A sends MD5(X) [128 bits]; B replies “X = Y?” [1 bit].]

• Shows that $CC(EQ, \mu, 2^{-128}) \le 129$ (modeling MD5 as a random hash).
• Error? (See the sketch below.)
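A sketch of this hash-and-compare protocol, with a keyed BLAKE2 hash from Python’s standard library standing in for MD5 and the shared randomness used as the hash key; the $2^{-128}$ error figure assumes the hash behaves like a random function:

```python
import hashlib
import random

def hash128(value, shared_key):
    """128-bit keyed hash, standing in for the slide's MD5(X)."""
    return hashlib.blake2b(value, key=shared_key, digest_size=16).digest()

rng = random.Random(2013)
shared_key = rng.randbytes(16)          # the shared randomness R

def equality_protocol(x, y):
    m1 = hash128(x, shared_key)         # Alice -> Bob: 128 bits
    m2 = hash128(y, shared_key) == m1   # Bob -> Alice: 1 bit, "X = Y?"
    return m2, 128 + 1                  # answer, communication cost

print(equality_protocol(b"foo", b"foo"))   # (True, 129)
print(equality_protocol(b"foo", b"bar"))   # (False, 129), except w.p. ~2^-128
```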
Examples
• Inner product: $IP(X, Y) := \sum_i X_i Y_i \bmod 2$.
• $CC(IP, \varepsilon) = \Omega(n)$.
In fact, using information complexity:
• $CC(IP, \varepsilon) \ge n - o_\varepsilon(n)$.
Information complexity
• information complexity :: communication complexity
as
• Shannon’s entropy :: transmission cost
Information complexity
• $IC(F, \mu, \varepsilon)$: the smallest amount of information Alice and Bob need to exchange to solve $F$.
• How is information measured?
• Communication cost of a protocol?
– Number of bits exchanged.
• Information cost of a protocol?
– Amount of information revealed.
Basic definition 1: The information cost of a protocol
• Prior distribution: $(X, Y) \sim \mu$.

[Figure: A holds $X$, B holds $Y$; running the protocol $\pi$ produces the protocol transcript $\Pi$.]

$IC(\pi, \mu) = I(\Pi; Y | X) + I(\Pi; X | Y)$
= what Alice learns about $Y$ + what Bob learns about $X$.
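A brute-force sketch computing this information cost for a tiny protocol in which Alice just sends her bit (the prior and the protocol are my illustrative choices):

```python
import math
from collections import defaultdict

def H(p):
    """Entropy in bits of a dict {outcome: probability}."""
    return sum(q * math.log2(1 / q) for q in p.values() if q > 0)

def marginal(p, idx):
    """Marginal distribution of the coordinates in idx."""
    m = defaultdict(float)
    for o, q in p.items():
        m[tuple(o[i] for i in idx)] += q
    return m

def cond_mi(p, a, b, c):
    """I(A;B|C) = H(AC) + H(BC) - H(ABC) - H(C) over a joint dict."""
    return (H(marginal(p, a + c)) + H(marginal(p, b + c))
            - H(marginal(p, a + b + c)) - H(marginal(p, c)))

# Prior mu: X and Y are bits that agree w.p. 3/4.  Protocol pi: Alice sends X,
# so the transcript Pi equals X.  Coordinates of the joint: (X, Y, Pi).
joint = {(x, y, x): (3 / 8 if x == y else 1 / 8) for x in (0, 1) for y in (0, 1)}

ic = cond_mi(joint, (2,), (1,), (0,)) + cond_mi(joint, (2,), (0,), (1,))
print(ic)  # I(Pi;Y|X) = 0 and I(Pi;X|Y) = h(1/4) ~ 0.811, so IC ~ 0.811 bits
```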
Example
• $F$ is $EQ$.
• $\mu$ is a distribution where w.p. $\frac{1}{2}$, $X = Y$, and w.p. $\frac{1}{2}$, $(X, Y)$ are uniformly random.

[Protocol: A sends MD5(X) [128 bits]; B replies “X = Y?” [1 bit].]

$IC(\pi, \mu) \approx 1 + 65 = 66$ bits
= what Alice learns about $Y$ + what Bob learns about $X$.
Prior matters a lot for information cost!
• If $\mu$ is a singleton distribution $\mu = 1_{(x, y)}$, then $IC(\pi, \mu) = 0$.
Example
• $F$ is $EQ$.
• $\mu$ is a distribution where $(X, Y)$ are just uniformly random (and independent).

[Protocol: A sends MD5(X) [128 bits]; B replies “X = Y?” [1 bit].]

$IC(\pi, \mu) \approx 0 + 128 = 128$ bits
= what Alice learns about $Y$ + what Bob learns about $X$.
Basic definition 2: Information complexity
• Communication complexity:
$CC(F, \mu, \varepsilon) = \min \{ CC(\pi) : \pi$ computes $F$ with error $\le \varepsilon$ w.r.t. $\mu \}$.
• Analogously:
$IC(F, \mu, \varepsilon) = \inf \{ IC(\pi, \mu) : \pi$ computes $F$ with error $\le \varepsilon$ w.r.t. $\mu \}$.
• The $\inf$ (rather than $\min$) is needed here!
Prior-free information complexity
• Using minimax one can get rid of the prior.
• For communication, we had:
$CC(F, \varepsilon) = \max_\mu CC(F, \mu, \varepsilon)$.
• For information:
$IC(F, \varepsilon) = \inf \{ \max_\mu IC(\pi, \mu) : \pi$ computes $F$ with error $\le \varepsilon$ on all inputs$\}$.
Operationalizing IC: Information equals amortized communication
• Recall [Shannon]: $\lim_{n \to \infty} C(X^n)/n = H(X)$.
• Turns out [B.-Rao’11]: $\lim_{n \to \infty} CC(F^n, \mu^n, \varepsilon)/n = IC(F, \mu, \varepsilon)$, for $\varepsilon > 0$. [Error allowed on each copy.]
• For $\varepsilon = 0$: $\lim_{n \to \infty} CC(F^n, \mu^n, 0^+)/n = IC(F, \mu, 0)$.
• [$\lim_{n \to \infty} CC(F^n, \mu^n, 0)/n$ is an interesting open problem.]
Entropy vs. Information Complexity

                  Entropy                        IC
Additive?         Yes                            Yes
Operationalized   $C(X^n)/n \to H(X)$            $CC(F^n, \mu^n, \varepsilon)/n \to IC(F, \mu, \varepsilon)$
Compression?      Huffman: $C(X) \le H(X) + 1$   ???!
Can interactive communication be compressed?
• Is it true that $CC(F, \mu, \varepsilon) \approx IC(F, \mu, \varepsilon)$?
• Less ambitiously: is $CC(F, \mu, O(\varepsilon)) = O(IC(F, \mu, \varepsilon))$?
• (Almost) equivalently: given a protocol $\pi$ with $IC(\pi, \mu) = I$, can Alice and Bob simulate $\pi$ using $\sim I$ bits of communication?
• Not known in general…
Applications
• Information = amortized communication means that to understand the amortized cost of a problem, it is enough to understand its information complexity.
Example: the disjointness function
• $S$, $T$ are subsets of $\{1, \dots, n\}$.
• Alice gets $S$, Bob gets $T$.
• Need to determine whether $S \cap T = \emptyset$.
• In binary notation, need to compute $\bigvee_{i=1}^{n} (X_i \wedge Y_i)$:
an operator on $n$ copies of the 2-bit AND function (see the sketch below).
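A two-line sketch of this decomposition on characteristic bit-vectors (the encoding is the obvious one):

```python
def AND(xi, yi):
    """The 2-bit primitive: one coordinate of disjointness."""
    return xi & yi

def disjoint(X, Y):
    """DISJ on characteristic bit-vectors: S and T intersect iff some AND fires."""
    return not any(AND(xi, yi) for xi, yi in zip(X, Y))

# S = {1, 3}, T = {2, 4} inside {1,...,4}:
print(disjoint([1, 0, 1, 0], [0, 1, 0, 1]))   # True: the sets are disjoint
```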
Set intersection
• $S$, $T$ are subsets of $\{1, \dots, n\}$.
• Alice gets $S$, Bob gets $T$.
• Want to compute $S \cap T$.
• This is just $n$ copies of the 2-bit AND.
• Understanding the information complexity of AND gives tight bounds on both problems!
Exact communication bounds [B.-Garg-Pankratov-Weinstein’13]
• $CC(Disj_n) \le n + 1$ (trivial).
• $CC(Disj_n) = \Omega(n)$ [Kalyanasundaram-Schnitger’87, Razborov’92].
New:
• $CC(Disj_n, 0^+) \approx 0.4827\, n$.
Small set disjointness
• $S$, $T$ are subsets of $\{1, \dots, n\}$, $|S|, |T| \le k$.
• Alice gets $S$, Bob gets $T$.
• Need to determine whether $S \cap T = \emptyset$.
• Trivial: $O(k \log n)$.
• [Hastad-Wigderson’07]: $\Theta(k)$.
• [BGPW’13]: $\frac{2}{\ln 2} k \pm o(k)$.
Open problem: Computability of IC
• Given the truth table of $F$, the prior $\mu$, and $\varepsilon$, compute $IC(F, \mu, \varepsilon)$.
• Via $IC(F, \mu, \varepsilon) = \lim_{n \to \infty} CC(F^n, \mu^n, \varepsilon)/n$ one can compute a sequence of upper bounds.
• But the rate of convergence as a function of $n$ is unknown.
Open problem: Computability of IC
• Can compute the $r$-round information complexity of $F$, $IC_r(F, \mu, \varepsilon)$.
• But the rate of convergence as a function of $r$ is unknown.
• Conjecture:
$IC_r(F, \mu, \varepsilon) = IC(F, \mu, \varepsilon) + O_{F, \mu, \varepsilon}(1/r^2)$.
• This is the relationship for the two-bit AND.
Thank You!