Information Complexity Lower Bounds


Rotem Oshman, Princeton CCI
Based on: Bar-Yossef, Jayram, Kumar, Srinivasan '04; Barak, Braverman, Chen, Rao '10

Communication Complexity

[Figure: Alice holds input X, Bob holds input Y; they exchange messages to compute f(X, Y) = ?]

Yao ‘79, “Some complexity questions related to distributive computing”

• Applications:
– Circuit complexity
– Streaming algorithms
– Data structures
– Distributed computing
– Property testing
– …

Communication Complexity

Deterministic Protocols

• A protocol specifies, at each point:
– Which player speaks next
– What the player should say
– When to halt and what to output

• Formally, a protocol is a function that maps the transcript so far (what we've said so far) to who speaks next (Alice, Bob, or ⊥ = halt), together with, for each player, a function mapping (input, transcript so far) to what to say/output.

Randomized Protocols

• Can use randomness to decide what to say
– Private randomness: each player has a separate source of random bits
– Public randomness: both players can use the same random bits
• Goal: for any input (X, Y), compute f(X, Y) correctly with probability ≥ 2/3
• Communication complexity: worst-case length of the transcript in any execution

Randomness Can Help a Lot

• Example: EQUALITY
– Input: X, Y ∈ {0,1}ⁿ
– Output: is X = Y?
• Trivial protocol: Alice sends X to Bob (n bits)
• For deterministic protocols, this is optimal!

EQUALITY Lower Bound

[Figure: the 2ⁿ × 2ⁿ communication matrix of EQUALITY, with rows and columns indexed 0ⁿ, …, 1ⁿ and 1s exactly on the diagonal.]

• A deterministic protocol partitions the matrix into monochromatic rectangles
• No rectangle can contain two of the diagonal 1s, so #rectangles ≥ 2ⁿ, and the protocol must communicate ≥ n bits

Randomized Protocol

• Protocol with public randomness:
– Select a random r ∈ {0,1}ⁿ
– Alice sends the single bit ⟨X, r⟩ = Σᵢ Xᵢ rᵢ (mod 2)
– Bob accepts iff ⟨Y, r⟩ = ⟨X, r⟩
• If X = Y: always accept
• If X ≠ Y: X ⊕ Y is a non-zero vector, so ⟨X ⊕ Y, r⟩ = 1 with probability 1/2
• Reject with probability 1/2 (amplify by repetition)
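A minimal Python sketch of this protocol (the function name and the repetition count `trials` are illustrative choices, not from the slides):

```python
import random

def equality_protocol(X, Y, trials=20):
    # Public-coin protocol for EQUALITY: Alice sends one bit per trial
    # (<X, r> mod 2), so the total communication is `trials` bits.
    n = len(X)
    for _ in range(trials):
        r = [random.randint(0, 1) for _ in range(n)]        # public randomness
        alice_bit = sum(x * ri for x, ri in zip(X, r)) % 2  # Alice's message
        bob_bit = sum(y * ri for y, ri in zip(Y, r)) % 2
        if alice_bit != bob_bit:
            return False  # certainly X != Y
    # Each trial catches X != Y with probability 1/2,
    # so the error probability is at most 2^-trials.
    return True
```

With 20 trials the protocol errs with probability at most 2⁻²⁰ while communicating only 20 bits, versus the n bits required deterministically.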

Set Disjointness

• Input: X, Y ⊆ [n] (equivalently, X, Y ∈ {0,1}ⁿ)
• Output: is X ∩ Y = ∅?
• Theorem [Kalyanasundaram, Schnitger '92, Razborov '92]: randomized CC = Ω(n)
– Easy to see for deterministic protocols
• Today we'll see a proof by Bar-Yossef, Jayram, Kumar, Srinivasan '04

Application: Streaming Lower Bounds

• Streaming algorithm:

[Figure: a stream of data items fed one-by-one to an algorithm with bounded memory. How much space is required to approximate f(data)?]

• Example: how many distinct items are in the data?
• Reduction from Disjointness [Alon, Matias, Szegedy '99]

Reduction from Disjointness:

• Fix a streaming algorithm for Distinct Elements with space s and universe size n
• Construct a protocol for Disj. with n elements:
– Alice runs the algorithm on her set X = {x₁, …, x_k}, then sends the state of the algorithm and k (#bits = s + log n)
– Bob continues the run on his set Y = {y₁, …, y_ℓ}
– X ∩ Y = ∅ ⇔ #distinct elements in the combined stream is k + ℓ
– Hence s = Ω(n) by the Disjointness lower bound
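To make the reduction concrete, here is a hedged Python sketch. `DistinctElements` stands in for an arbitrary streaming algorithm; its exact, set-based state is a placeholder assumption (a real space-s algorithm would keep an approximate sketch), but the protocol mechanics are exactly the reduction above.

```python
class DistinctElements:
    # Placeholder streaming algorithm: its internal state is what Alice
    # ships to Bob. A real algorithm would keep a small sketch instead.
    def __init__(self):
        self.state = set()

    def process(self, item):
        self.state.add(item)

    def query(self):
        return len(self.state)

def disjointness_via_streaming(X, Y):
    alg = DistinctElements()
    for x in X:                      # Alice streams her set...
        alg.process(x)
    state, k = alg.state, len(X)     # ...and sends (state, k) to Bob

    bob = DistinctElements()
    bob.state = state                # Bob resumes from Alice's state
    for y in Y:
        bob.process(y)
    # The combined stream has |X| + |Y| distinct items iff X, Y are disjoint.
    return bob.query() == k + len(Y)
```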

Application 2: KW Games

• Circuit depth lower bounds:

• How deep does the circuit need to be?

[Figure: a Boolean circuit of ∧ and ∨ gates over inputs x₁ … xₙ, computing f(x₁, …, xₙ).]

Application 2: KW Games

• Karchmer-Wigderson’93,Karchmer-Raz-Wigderson’94:

– Alice gets X with f(X) = 0; Bob gets Y with f(Y) = 1
– Goal: find a coordinate i such that Xᵢ ≠ Yᵢ

Application 2: KW Games

• Claim: if the KW game for f has deterministic CC ≥ d, then f requires circuit depth ≥ d.
• Circuit with depth d ⇒ protocol with length d

[Figure: walking down the circuit. At an ∧ gate Alice, whose input satisfies f(X) = 0, names a child gate that evaluates to 0 on X; at an ∨ gate Bob names a child that evaluates to 1 on Y. After at most depth-many steps they reach a literal xᵢ on which X and Y differ.]

Information-Theoretic Lower Bound on Set Disjointness

Some Basic Concepts from Info Theory

• Entropy of a random variable: H(X) = −Σₓ Pr[X = x] · log Pr[X = x]
• Important properties:
– H(X) ≥ 0
– H(X) = 0 ⇔ X is deterministic
– H(X) = expected # bits needed to encode X (up to +1)
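As a sanity check, a few lines of Python (an illustrative helper, not from the slides) computing H for some simple distributions:

```python
import math

def entropy(dist):
    # Shannon entropy H(X) = -sum p*log2(p), for a distribution given
    # as a list of probabilities (terms with p in {0, 1} contribute 0).
    return -sum(p * math.log2(p) for p in dist if 0 < p < 1)

print(entropy([0.5, 0.5]))        # fair coin: 1.0 bit
print(entropy([1.0]))             # deterministic: 0 bits
print(entropy([1/3, 1/3, 1/3]))   # uniform on 3 values: log2(3) = 1.58...
```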

Some Basic Concepts from Info Theory

• Conditional entropy: H(X | Y) = Σ_y Pr[Y = y] · H(X | Y = y)
• Important properties:
– H(X | Y) ≤ H(X)
– H(X | Y) = H(X) ⇔ X, Y are independent
• Example: X uniform on {0,1}², Y = X₁ ∧ X₂
– If Y = 0 then H(X | Y = 0) = log 3; if Y = 1 then H(X | Y = 1) = 0

Some Basic Concepts from Info Theory

• Mutual information: I(X; Y) = H(X) − H(X | Y) = H(Y) − H(Y | X)

• Conditional mutual information: I(X; Y | Z) = H(X | Z) − H(X | Y, Z)

• Important properties:
– I(X; Y) ≥ 0
– I(X; Y) = I(Y; X)
– I(X; Y) = 0 ⇔ X, Y are independent

Some Basic Concepts from Info Theory

• Chain rule for mutual information: I(X₁, X₂; Y) = I(X₁; Y) + I(X₂; Y | X₁)

• More generally, I(X₁, …, Xₙ; Y) = Σᵢ I(Xᵢ; Y | X₁, …, Xᵢ₋₁)
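For completeness, the two-variable case follows mechanically from the definitions (a standard textbook derivation, not verbatim from the slides):

```latex
\begin{align*}
I(X_1, X_2; Y) &= H(X_1, X_2) - H(X_1, X_2 \mid Y)\\
&= H(X_1) + H(X_2 \mid X_1) - H(X_1 \mid Y) - H(X_2 \mid X_1, Y)\\
&= \underbrace{H(X_1) - H(X_1 \mid Y)}_{I(X_1; Y)}
 + \underbrace{H(X_2 \mid X_1) - H(X_2 \mid X_1, Y)}_{I(X_2; Y \mid X_1)}.
\end{align*}
```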

Information Cost of Protocols

• Fix an input distribution μ on (X, Y)
• Given a protocol π, let Π also denote the distribution of π's transcript
• Information cost of π: IC_μ(π) = I(Π; X | Y) + I(Π; Y | X)
• Information cost of a function f: IC_μ(f) = inf over protocols π computing f of IC_μ(π)

Information Cost of Protocols

• Important property: I(Π; X | Y) = Σᵣ I(Πᵣ; X | Y, Π₁, …, Πᵣ₋₁) (and similarly for I(Π; Y | X))
• Proof: by induction, using the chain rule. Let Πᵣ denote the message of round r.
– (Π₁, …, Πᵣ): what we know after r rounds
– (Π₁, …, Πᵣ₋₁): what we knew after r − 1 rounds
– I(Πᵣ; X | Y, Π₁, …, Πᵣ₋₁): what we learn in round r, given what we already know

Information vs. Communication

• Want: IC_μ(π) ≤ |π| (the number of bits communicated)
• Suppose bit Πᵣ is sent by Alice.
• What does Alice learn? Nothing:
– Πᵣ is a function of X and Π₁, …, Πᵣ₋₁, so I(Πᵣ; Y | X, Π₁, …, Πᵣ₋₁) = 0
• What does Bob learn? At most one bit:
– I(Πᵣ; X | Y, Π₁, …, Πᵣ₋₁) ≤ H(Πᵣ) ≤ 1

Information vs. Communication

• Important property: IC_μ(π) ≤ CC(π)
• A lower bound on information cost ⇒ a lower bound on communication complexity
• In fact, IC lower bounds are the most powerful technique we know

Information Complexity of Disj.

• Disjointness: is X ∩ Y = ∅, i.e. ⋁ᵢ (Xᵢ ∧ Yᵢ) = 0?
• Strategy: for some "hard distribution" μ,
1. Direct sum: IC_{μⁿ}(Disj) ≥ n · IC_μ(AND)
2. Prove that IC_μ(AND) = Ω(1).

Hard Distribution for Disjointness

• For each coordinate i, (Xᵢ, Yᵢ) is drawn independently:
– (Xᵢ, Yᵢ) = (0, 0) with probability 1/3
– (Xᵢ, Yᵢ) = (0, 1) with probability 1/3
– (Xᵢ, Yᵢ) = (1, 0) with probability 1/3
– (Xᵢ, Yᵢ) = (1, 1) with probability 0
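A few lines of Python make μ concrete (an illustrative sampler, not part of the proof):

```python
import random

def sample_mu(n):
    # Each coordinate is drawn independently and uniformly from
    # {(0,0), (0,1), (1,0)} -- the pair (1,1) never occurs, so under
    # mu the sets X and Y are always disjoint.
    pairs = [(0, 0), (0, 1), (1, 0)]
    coords = [random.choice(pairs) for _ in range(n)]
    X = [a for a, _ in coords]
    Y = [b for _, b in coords]
    return X, Y
```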

IC_{μⁿ}(Disj) ≥ n · IC_μ(AND)

• Let π be a protocol for Disj on inputs from μⁿ
• Construct ρ for AND as follows:
– Alice and Bob get inputs U, V ∈ {0,1}
– Choose a random coordinate i, set (Xᵢ, Yᵢ) = (U, V)
– Sample the remaining coordinates and run π(X, Y)
– For each j ≠ i, how should (Xⱼ, Yⱼ) ~ μ be sampled?

IC_{μⁿ}(Disj) ≥ n · IC_μ(AND)

• Let π be a protocol for Disj on inputs from μⁿ
• Construct ρ for AND as follows:
– Alice and Bob get inputs U, V ∈ {0,1}
– Choose a random coordinate i, set (Xᵢ, Yᵢ) = (U, V)
– Bad idea: publicly sample (X₋ᵢ, Y₋ᵢ)
• Suppose that in π, Alice sends the single bit ⊕ⱼ Xⱼ
– In π, Bob learns one bit about all of X; in ρ he should learn only ≈ 1/n bits about U
– But if X₋ᵢ is public, Bob learns a full bit about U!

IC_{μⁿ}(Disj) ≥ n · IC_μ(AND)

• Another bad idea: publicly sample X₋ᵢ, and Bob privately samples Y₋ᵢ given X₋ᵢ
– X₋ᵢ is still public, so the same problem remains
– We would rather have each player sample its own coordinates privately
– But the players can't sample Xⱼ, Yⱼ independently: under μ they are correlated…

IC_{μⁿ}(Disj) ≥ n · IC_μ(AND)

• Solution: break the correlation with an auxiliary variable
– For each j ≠ i, publicly sample an "owner" Dⱼ ∈ {Alice, Bob}
– If Dⱼ = Alice: set Yⱼ = 0, and Alice privately samples Xⱼ (Xⱼ = 1 with probability 2/3)
– If Dⱼ = Bob: set Xⱼ = 0, and Bob privately samples Yⱼ (Yⱼ = 1 with probability 2/3)
– Conditioned on Dⱼ, the players' samples are independent, and (Xⱼ, Yⱼ) ~ μ

Direct Sum Theorem

• Transcript of ρ: the transcript Π of π, together with the public randomness (i, D₋ᵢ)
• Need to show: IC_μ(ρ) ≤ (1/n) · IC_{μⁿ}(π)

Information Complexity of Disj.

• Disjointness: is X ∩ Y = ∅, i.e. ⋁ᵢ (Xᵢ ∧ Yᵢ) = 0?
• Strategy: for some "hard distribution" μ,
1. Direct sum: IC_{μⁿ}(Disj) ≥ n · IC_μ(AND) ✓
2. Prove that IC_μ(AND) = Ω(1).

Hardness of AND

[Figure: the 2×2 table of AND inputs (U, V), with μ-probabilities 1/3 on (0, 0), (0, 1), (1, 0) and 0 on (1, 1).]

• On (0, 0), (0, 1), (1, 0) the output is U ∧ V = 0; only on (1, 1) is it 1
• So the transcript on (1, 1) should be "very different" from the transcripts on the other inputs

Hellinger Distance

• Hellinger distance: h(P, Q) = (1/√2) · ‖√P − √Q‖₂, i.e. h²(P, Q) = 1 − Σₓ √(P(x) · Q(x))
• Examples:
– If P = Q: h(P, Q) = 0
– If P, Q have disjoint support: h(P, Q) = 1
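A short Python helper (illustrative, not from the slides) that matches both examples:

```python
import math

def hellinger(P, Q):
    # Hellinger distance between distributions given as dicts mapping
    # outcome -> probability: h^2(P, Q) = 1 - sum_x sqrt(P(x) * Q(x)).
    support = set(P) | set(Q)
    bc = sum(math.sqrt(P.get(x, 0) * Q.get(x, 0)) for x in support)
    return math.sqrt(max(0.0, 1.0 - bc))  # clamp for float round-off

print(hellinger({'a': 1.0}, {'a': 1.0}))  # identical: 0.0
print(hellinger({'a': 1.0}, {'b': 1.0}))  # disjoint support: 1.0
```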

Hellinger Distance

• Hellinger distance is a metric
– h(P, Q) ≥ 0, with equality iff P = Q
– h(P, Q) = h(Q, P)
– Triangle inequality: h(P, R) ≤ h(P, Q) + h(Q, R)

[Figure: three distributions P, Q, R illustrating the triangle inequality.]

Hellinger Distance

• If the protocol must answer differently on some input (u, v) than on (1, 1), then the transcript distributions must be far apart:

h(Π_{uv}, Π₁₁) ≥ 2/(3√2)

[Figure: the 2×2 AND table, marking the inputs that must be distinguished from (1, 1).]

Hellinger Distance vs. Mutual Info

• Let Π₀, Π₁ be two distributions
• Select Z by choosing B ∈ {0, 1} uniformly, then drawing Z ~ Π_B
• Then I(B; Z) ≥ h²(Π₀, Π₁)
• Under μ, conditioned on X = 0 (resp. Y = 0), the other player's input is uniform, so:

I(Π; Y | X = 0) ≥ h²(Π₀₀, Π₀₁)
I(Π; X | Y = 0) ≥ h²(Π₀₀, Π₁₀)

Hardness of AND

• Putting it together:
– IC_μ(π) ≥ I(Π; Y | X = 0) + I(Π; X | Y = 0) ≥ h²(Π₀₀, Π₀₁) + h²(Π₀₀, Π₁₀) (up to constants)
– Triangle inequality: h²(Π₀₀, Π₀₁) + h²(Π₀₀, Π₁₀) ≥ (1/2) · h²(Π₀₁, Π₁₀)
– Cut-and-paste lemma: h(Π₀₁, Π₁₀) = h(Π₀₀, Π₁₁)
• Intuition: on (0, 1) and (1, 1) the transcript is the same for Alice until Bob acts differently; on (1, 0) and (1, 1) it is the same for Bob until Alice acts differently
• Correctness on (0, 0) vs. (1, 1) forces h(Π₀₀, Π₁₁) ≥ 2/(3√2), so IC_μ(AND) = Ω(1)

“Cut-n-Paste Lemma”

• Recall: we want h(Π₀₁, Π₁₀) = h(Π₀₀, Π₁₁)
• Enough to show: we can write Pr[Π(x, y) = t] = q_A(x, t) · q_B(y, t), since then √(Π₀₁(t) · Π₁₀(t)) = √(Π₀₀(t) · Π₁₁(t)) for every transcript t

“Cut-n-Paste Lemma”

• We can write Pr[Π(x, y) = t] = q_A(x, t) · q_B(y, t)

• Proof:
– π induces a distribution on "partial transcripts" of each length ℓ: the probability that the first ℓ bits are t
– By induction on ℓ: Pr[first ℓ bits are t] = q_A(x, t) · q_B(y, t)
• Base case: ℓ = 0 (the empty transcript)
– Set q_A(x, ∅) = q_B(y, ∅) = 1

“Cut-n-Paste Lemma”

• Step: suppose the claim holds for partial transcripts t of length ℓ
• Suppose that after t, it is Alice's turn to speak
• What Alice says depends on:
– Her input x
– Her private randomness
– The transcript so far, t

• So Pr[next bit is b] = p(b | x, t) does not depend on y
• Set q_A(x, t∘b) = q_A(x, t) · p(b | x, t) and q_B(y, t∘b) = q_B(y, t)
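Spelled out in LaTeX, the induction step assembles as follows (same q_A, q_B as above; this expansion is mine, not verbatim from the slides):

```latex
\Pr\!\left[\Pi(x,y)_{\le \ell + 1} = t \circ b\right]
  = \Pr\!\left[\Pi(x,y)_{\le \ell} = t\right] \cdot p(b \mid x, t)
  = \underbrace{q_A(x, t)\, p(b \mid x, t)}_{q_A(x,\; t \circ b)}
    \cdot \underbrace{q_B(y, t)}_{q_B(y,\; t \circ b)}.
```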

Hardness of AND

• Recap: under μ (probabilities 1/3, 1/3, 1/3, 0 on the 2×2 AND table), the cut-and-paste lemma gives h(Π₀₁, Π₁₀) = h(Π₀₀, Π₁₁) ≥ 2/(3√2)
• Hence IC_μ(AND) = Ω(1), and by the direct sum, IC_{μⁿ}(Disj) = Ω(n)

Multi-Player Communication Complexity

The Coordinator Model

[Figure: k sites with inputs X₁, X₂, …, X_k, each connected to a coordinator; they exchange bits with the coordinator to compute f(X₁, …, X_k) = ?]

Multi-Party Set Disjointness

• Input: X₁, …, X_k ⊆ [n]
• Output: is X₁ ∩ ⋯ ∩ X_k = ∅?
• Braverman, Ellen, O., Pitassi, Vaikuntanathan '13: lower bound of Ω(nk) bits

Reduction from DISJ to graph connectivity

• Given X₁, …, X_k, we want to:
– Choose vertices
– Design inputs (edges for each player) such that the graph is connected iff ⋂ᵢ Xᵢ = ∅

Reduction from DISJ to graph connectivity

[Figure: a graph on players p₁, …, p_k (Players) and elements 1, …, 6 (Elements); each player pᵢ's edges encode its set Xᵢ, elements in [n] ∖ ⋃ᵢ Xᵢ are treated separately, and the construction guarantees: input graph connected ⇔ ⋂ᵢ Xᵢ = ∅.]
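One concrete way to realize such a reduction, sketched in Python below. The gadget here — a fixed, input-independent path through the players, plus an edge pᵢ–e whenever e ∉ Xᵢ, so that elements in [n] ∖ ⋃ᵢ Xᵢ touch every player — is my illustrative choice and may differ from the construction in the paper, but it satisfies the stated equivalence, and each player's edges depend only on its own input.

```python
def build_graph(sets, n):
    # Vertices: players p_0..p_{k-1} and elements 0..n-1.
    # Edges: a path through the players (independent of the input), and
    # p_i -- e whenever element e is NOT in X_i. An element is isolated
    # iff it belongs to every X_i, hence:
    #   graph connected  <=>  the sets X_i have empty intersection.
    k = len(sets)
    vertices = [('p', i) for i in range(k)] + [('e', e) for e in range(n)]
    adj = {v: set() for v in vertices}
    for i in range(k - 1):                       # the player path
        adj[('p', i)].add(('p', i + 1))
        adj[('p', i + 1)].add(('p', i))
    for i, X in enumerate(sets):                 # player i's own edges
        for e in range(n):
            if e not in X:
                adj[('p', i)].add(('e', e))
                adj[('e', e)].add(('p', i))
    return adj

def connected(adj):
    # Depth-first search from an arbitrary vertex.
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adj)

print(connected(build_graph([{0, 1}, {1, 2}, {2, 3}], 4)))  # empty intersection: True
print(connected(build_graph([{0, 1}, {1, 2}, {1, 3}], 4)))  # 1 in every set: False
```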

Other Stuff

• Distributed computing

Other Stuff

• Compressing down to information cost
• Number-on-forehead lower bounds
• Open questions in communication complexity