
Transcript of Csr2011 june15 09_30_shen

CSR-2011

Kolmogorov complexity as a language

Alexander Shen, LIF CNRS, Marseille; on leave from IITP RAS (ИППИ РАН), Moscow

Kolmogorov complexity

- A powerful tool
- Just a way to reformulate arguments
- Three languages: combinatorial / algorithmic / probabilistic

Reminder and notation

- K(x) = the minimal length of a program that produces x (a compression-based upper bound is sketched below)
- KD(x) = min{ |p| : D(p) = x }
- depends on the interpreter D
- an optimal D makes it minimal up to an O(1) additive term
- Variations:

                               p = string               p = prefix of a sequence
    x = string                 plain: K(x), C(x)        prefix: KP(x), K(x)
    x = prefix of a sequence   decision: KR(x), KD(x)   monotone: KM(x), Km(x)

- Conditional complexity C(x|y): the minimal length of a program p : y ↦ x.
- There is also a priori probability (in two versions: discrete, on strings; continuous, on prefixes)
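
A minimal illustration (not from the slides): C(x) is uncomputable, but any real compressor yields an upper bound up to the O(1) choice of interpreter. Here zlib plays the role of a (far from optimal) interpreter D, and the compressed string plays the role of the program p:

```python
# Sketch: an upper bound on plain complexity via a general-purpose compressor.
# zlib is only a stand-in for an optimal interpreter D.
import os
import zlib

def complexity_upper_bound(x: bytes) -> int:
    """Length in bits of a zlib description of x: an upper bound on C(x) + O(1)."""
    return 8 * len(zlib.compress(x, 9))

print(complexity_upper_bound(b"a" * 1000))       # highly regular: far below 8000 bits
print(complexity_upper_bound(os.urandom(1000)))  # incompressible: about 8000 bits
```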

Foundations of probability theory

- Random object or random process?
- "well shuffled deck of cards": any meaning? [xkcd cartoon]
- randomness = incompressibility (maximal complexity; see the sketch below)
- ω = ω1ω2… is random iff KM(ω1…ωn) ≥ n − O(1)
- Classical probability theory: a random sequence satisfies the Strong Law of Large Numbers with probability 1
- Algorithmic version: every (algorithmically) random sequence satisfies the SLLN
- algorithmic ⇒ classical: Martin-Löf random sequences form a set of measure 1.
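
A small numeric sketch of the slogan, with zlib standing in for an optimal description method: a sequence of independent biased bits does not have maximal complexity (roughly nH(p) < n bits suffice), and even an off-the-shelf compressor sees much of the gap:

```python
# Sketch: fair coin flips look incompressible, biased ones do not.
import random
import zlib

def pack_bits(bits):
    """Pack a list of 0/1 ints into bytes (length assumed divisible by 8)."""
    return bytes(int("".join(map(str, bits[i:i + 8])), 2)
                 for i in range(0, len(bits), 8))

n = 80_000
fair = [random.randint(0, 1) for _ in range(n)]
biased = [1 if random.random() < 0.9 else 0 for _ in range(n)]

for name, seq in [("fair", fair), ("biased p=0.9", biased)]:
    compressed = 8 * len(zlib.compress(pack_bits(seq), 9))
    print(f"{name}: {compressed} compressed bits out of {n}")
```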

Sampling random strings (S. Aaronson)

- A device that (being switched on) produces an N-bit string and stops
- "The device produces a random string": what does it mean?
- classical: the output distribution is close to the uniform one
- effective: with high probability the output string is incompressible
- the two are not equivalent if no assumptions are made about the device
- but they are related under some assumptions (see the counting sketch below)
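
A counting-based sketch of how the two readings connect: fewer than 2^(n−c) strings have complexity below n − c, so a truly uniform sampler outputs a string that compresses by more than c bits with probability below 2^(−c). zlib only upper-bounds complexity (and adds header overhead), so it flags even fewer strings:

```python
# Sketch: uniformly sampled strings are almost never compressible.
import os
import zlib

n_bytes, trials, c = 64, 10_000, 16
saved = sum(1 for _ in range(trials)
            if 8 * len(zlib.compress(os.urandom(n_bytes), 9)) < 8 * n_bytes - c)
print(f"{saved}/{trials} samples compressed by more than {c} bits")  # expect 0
```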

Example: matrices without uniform minors

- a k × k minor of an n × n Boolean matrix: select k rows and k columns
- a minor is uniform if it is all-0 or all-1
- claim: there is an n × n bit matrix without k × k uniform minors for k = 3 log n

Counting argument and complexity reformulation

- ≤ n^k × n^k positions of the minor [k = 3 log n]
- 2 types of uniform minors (0/1)
- 2^(n² − k²) possibilities for the rest of the matrix
- n^(2k) × 2 × 2^(n² − k²) = 2^(6 log² n + 1 + (n² − 9 log² n)) < 2^(n²) (checked numerically in the sketch below)

- log n bits to specify a column or row: 2k log n bits in total
- one additional bit to specify the type of the minor (0/1)
- n² − k² bits to specify the rest of the matrix
- 2k log n + 1 + (n² − k²) = 6 log² n + 1 + (n² − 9 log² n) < n².
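
A quick numeric check of the counting above (the loop and the chosen values of n are just for illustration): for k = 3 log2 n, the log of the number of matrices containing a uniform minor, 2k log n + 1 + (n² − k²), stays below n² with margin 3 log² n − 1:

```python
# Sketch: verify 2k*log2(n) + 1 + (n^2 - k^2) < n^2 for k = 3 log2 n.
from math import log2

for n in [16, 64, 256, 1024]:
    k = 3 * log2(n)
    bad_log = 2 * k * log2(n) + 1 + (n * n - k * k)  # log2 of the count of bad matrices
    print(n, bad_log < n * n, n * n - bad_log)       # margin equals 3*log2(n)**2 - 1
```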

One-tape Turing machines

- copying an n-bit string on a one-tape TM requires Ω(n²) time
- complexity version: if initially the tape was empty to the right of the border, then after n steps the complexity of the zone that is d cells away from the border is O(n/d): K(u(t)) ≤ O(n/d)
- proof: border guards in each cell of the border security zone write down the contents of the head of the TM; each of these records is enough to reconstruct u(t), so its length must be Ω(K(u(t))); the sum of the lengths does not exceed the running time

Everywhere complex sequences

- a random sequence has n-bit prefixes of complexity n
- but some factors (substrings) have small complexity
- Levin: there exist everywhere complex sequences: every n-bit substring has complexity at least 0.99n − O(1)
- Combinatorial equivalent: let F be a set of strings that contains at most 2^(0.99n) strings of length n; then there is a sequence ω such that all sufficiently long substrings of ω are not in F
- the combinatorial and complexity proofs are not just translations of each other (Lovász local lemma; Rumyantsev, Miller, Muchnik)

Gilbert–Varshamov complexity bound

- coding theory: how many n-bit strings x1, …, xk can one find if the Hamming distance between every two of them is at least d?
- lower bound (Gilbert–Varshamov; see the greedy sketch below)
- then < d/2 changed bits are harmless
- but bit insertions or deletions could be harmful
- general requirement: C(xi|xj) ≥ d
- generalization of the GV bound: a d-separated family of size Ω(2^(n−d))
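
A sketch of the greedy construction behind the Gilbert–Varshamov lower bound, at toy sizes (the scan over all 2^n strings is exponential): keep a string whenever it is at Hamming distance ≥ d from everything kept so far; each kept word excludes at most a ball of volume V(n, d−1), so at least 2^n / V(n, d−1) words survive:

```python
# Sketch: greedy code with minimum distance d vs. the GV guarantee.
from math import comb

def gv_greedy(n: int, d: int) -> list[int]:
    code = []
    for x in range(2 ** n):
        if all(bin(x ^ y).count("1") >= d for y in code):
            code.append(x)
    return code

n, d = 10, 3
code = gv_greedy(n, d)
ball = sum(comb(n, i) for i in range(d))  # Hamming ball volume V(n, d-1)
print(len(code), ">=", 2 ** n // ball)    # greedy size vs. GV lower bound
```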

Inequalities for complexities and combinatorial interpretation

- C(x, y) ≤ C(x) + C(y|x) + O(log)
- C(x, y) ≥ C(x) + C(y|x) + O(log)
- C(x, y) < k + l ⇒ C(x) < k + O(log) or C(y|x) < l + O(log)
- every set A of size < 2^(k+l) can be split into two parts A = A1 ∪ A2 such that w(A1) ≤ 2^k and h(A2) ≤ 2^l (see the sketch below)
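
A constructive sketch of the last item (the parameters and the random test set are hypothetical): put into A1 the pairs whose x-section has more than 2^l elements; since |A| < 2^(k+l), fewer than 2^k values of x can have such a large section, and every x-section of the remainder A2 is small:

```python
# Sketch: split A into A1 (small width) and A2 (small height).
import random
from collections import defaultdict

def split(A, k, l):
    sections = defaultdict(set)
    for x, y in A:
        sections[x].add(y)
    A1 = {(x, y) for x, ys in sections.items() if len(ys) > 2 ** l for y in ys}
    return A1, set(A) - A1

k, l = 5, 6
A = {(random.randrange(30), random.randrange(1000)) for _ in range(2 ** (k + l) - 1)}
A1, A2 = split(A, k, l)
width = len({x for x, _ in A1})                    # number of distinct x's in A1
height = max([sum(1 for a in A2 if a[0] == x) for x, _ in A2] or [0])
print(width <= 2 ** k, height <= 2 ** l)           # both guaranteed to hold
```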

One more inequality

- 2C(x, y, z) ≤ C(x, y) + C(y, z) + C(x, z)
- combinatorial counterpart: V² ≤ S1 × S2 × S3 (the squared volume of a body is at most the product of the areas of its three coordinate projections; checked numerically below)
- also holds for Shannon entropies; a special case of the Shearer lemma
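
A numeric sanity check of the combinatorial counterpart (the discrete Loomis–Whitney inequality): for any finite set A of triples, |A|² is at most the product of the sizes of its three two-dimensional projections:

```python
# Sketch: |A|^2 <= |P_xy| * |P_yz| * |P_xz| for random sets of triples.
import random

for _ in range(5):
    A = {(random.randrange(8), random.randrange(8), random.randrange(8))
         for _ in range(random.randrange(1, 300))}
    pxy = {(x, y) for x, y, z in A}
    pyz = {(y, z) for x, y, z in A}
    pxz = {(x, z) for x, y, z in A}
    assert len(A) ** 2 <= len(pxy) * len(pyz) * len(pxz)
print("holds on all random samples")
```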

Common information and graph minors

- mutual information: I(a : b) = C(a) + C(b) − C(a, b)
- common information:
- combinatorial version (graph minors): can the graph be covered by 2^δ minors of size 2^(α−δ) × 2^(β−δ)?

Almost uniform sets

- nonuniformity = (maximal section) / (average section) (see the sketch below)
- Theorem: every set of N elements can be represented as a union of polylog(N) sets whose nonuniformity is polylog(N)
- multidimensional version
- how to construct the parts using Kolmogorov complexity: take the strings with given complexity bounds
- so simple that it is not clear what the combinatorial translation is
- but a combinatorial argument exists (and gives an even stronger result)
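
A minimal sketch of the nonuniformity measure, assuming "average section" means the average size of the nonempty x-sections of a finite 2D set:

```python
# Sketch: nonuniformity = (maximal section) / (average nonempty section).
import random
from collections import Counter

A = {(random.randrange(50), random.randrange(50)) for _ in range(600)}
sections = Counter(x for x, _ in A)  # sizes of the nonempty x-sections
nonuniformity = max(sections.values()) / (len(A) / len(sections))
print(f"nonuniformity: {nonuniformity:.2f}")
```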

Shannon coding theorem

- ξ is a random variable; k values, probabilities p1, …, pk
- ξ^N: N independent trials of ξ
- Shannon's informal question: how many bits are needed to encode a "typical" value of ξ^N?
- Shannon's answer: NH(ξ), where H(ξ) = p1 log(1/p1) + … + pk log(1/pk)
- the formal statement is a bit complicated
- Complexity version: with high probability the value of ξ^N has complexity close to NH(ξ) (see the sketch below)
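
A sketch of the complexity version, with zlib as a crude upper bound on complexity and an arbitrary example distribution: compress N i.i.d. draws of ξ and compare with NH(ξ):

```python
# Sketch: compressed size of N i.i.d. samples lands close to (and above) N*H.
import random
import zlib
from math import log2

values, probs = b"abc", [0.7, 0.2, 0.1]
H = -sum(p * log2(p) for p in probs)  # entropy in bits per symbol

N = 100_000
sample = bytes(random.choices(values, probs, k=N))
compressed_bits = 8 * len(zlib.compress(sample, 9))
print(f"N*H = {N * H:.0f} bits, compressed = {compressed_bits} bits")
```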

Complexity, entropy and group size

- 2C(x, y, z) ≤ C(x, y) + C(y, z) + C(x, z) + O(log)
- the same for entropy: 2H(ξ, η, τ) ≤ H(ξ, η) + H(ξ, τ) + H(η, τ) (checked numerically below)
- …and even for the sizes of subgroups U, V, W of some finite group G:
  2 log(|G|/|U ∩ V ∩ W|) ≤ log(|G|/|U ∩ V|) + log(|G|/|U ∩ W|) + log(|G|/|V ∩ W|)
- in all three cases the inequalities are the same (Romashchenko, Chan, Yeung)
- some of them are quite strange:
  I(a : b) ≤ I(a : b|c) + I(a : b|d) + I(c : d) + I(a : b|e) + I(a : e|b) + I(b : e|a)
- related to Romashchenko's theorem: if the three last terms are zeros, one can extract the common information from a, b, e
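
A numeric sketch of the entropy form: draw a random joint distribution of three ternary variables and check 2H(ξ, η, τ) ≤ H(ξ, η) + H(ξ, τ) + H(η, τ):

```python
# Sketch: the triple entropy inequality on a random joint distribution.
import random
from math import log2

def H(dist):
    """Shannon entropy (bits) of a distribution {outcome: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def marginal(joint, coords):
    m = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in coords)
        m[key] = m.get(key, 0.0) + p
    return m

weights = [random.random() for _ in range(27)]
total = sum(weights)
joint = {(i, j, k): weights[9 * i + 3 * j + k] / total
         for i in range(3) for j in range(3) for k in range(3)}

lhs = 2 * H(joint)
rhs = sum(H(marginal(joint, c)) for c in [(0, 1), (0, 2), (1, 2)])
print(lhs <= rhs + 1e-9, round(lhs, 3), round(rhs, 3))
```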

Muchnik and Slepian–Wolf

- a, b: two strings
- we look for a program p that maps a to b
- by definition C(p) is at least C(b|a) but could be higher
- there exists a p : a ↦ b that is simple relative to b, e.g., "map everything to b"
- Muchnik's theorem: the two conditions can be combined: there exists p : a ↦ b such that C(p) ≈ C(b|a) and C(p|b) ≈ 0
- information theory analog: Slepian–Wolf
- a similar technique was developed by Fortnow and Laplante (randomness extractors)
- (Romashchenko, Musatov): how to use explicit extractors and derandomization to get space-bounded versions

Computability theory: simple sets

- Simple set: an enumerable set with infinite complement such that no algorithm can generate infinitely many elements of the complement
- Construction using Kolmogorov complexity: call a string x simple if C(x) ≤ |x|/2, and take the set of simple strings
- Most strings are not simple ⇒ the complement is infinite
- Let x1, x2, … be a computable sequence of different non-simple strings
- We may assume wlog that |xi| > i, and therefore C(xi) > i/2
- but to specify xi we need only O(log i) bits, a contradiction
- "the minimal integer that cannot be described in ten English words" (Berry)

Lower semicomputable random reals

- ∑ ai: a computable converging series with rational terms
- is α = ∑ ai computable (ε ↦ ε-approximation)?
- not necessarily (Specker's example)
- lower semicomputable reals
- Solovay classification: α ≼ β if an ε-approximation to β can be effectively converted into an O(ε)-approximation to α
- there are maximal elements = random lower semicomputable reals
- = slowly converging series
- modulus of convergence: ε ↦ N(ε) = how many terms are needed for ε-precision
- maximal elements: N(2^(−n)) > BP(n − O(1)), where BP(k) is the maximal integer whose prefix complexity is at most k


Lovász local lemma: constructive proof

• CNF: (a ∨ ¬b ∨ …) ∧ (¬c ∨ e ∨ …) ∧ …
• Neighbors: clauses having common variables
• Several clauses, with k literals in each
• Each clause has o(2^k) neighbors
• ⇒ the CNF is satisfiable
• Non-constructive proof: a lower bound for the probability, the Lovász local lemma
• Naïve algorithm: just resample a false clause (while one exists), as sketched below
• Recent breakthrough (Moser): this algorithm terminates quickly with high probability
• Explanation: if it did not, the sequence of resampled clauses would encode the random bits used in resampling, making them compressible
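A minimal sketch of the naïve resampling algorithm; the toy formula and the function names are ours. Clauses are in the usual DIMACS convention: a positive integer v is the literal x_v, a negative one its negation.

import random

def resample_until_satisfied(clauses, num_vars, seed=0):
    # Naive algorithm: start from a random assignment and, while some
    # clause is false, resample that clause's variables with fresh bits.
    rng = random.Random(seed)
    assign = [None] + [rng.random() < 0.5 for _ in range(num_vars)]

    def find_false_clause():
        for clause in clauses:
            # a clause is false iff every literal in it is false
            if all(assign[abs(l)] != (l > 0) for l in clause):
                return clause
        return None

    steps = 0
    clause = find_false_clause()
    while clause is not None:
        for l in clause:  # fresh random bits for this clause's variables
            assign[abs(l)] = rng.random() < 0.5
        steps += 1
        clause = find_false_clause()
    return assign[1:], steps

# toy CNF: (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
assignment, steps = resample_until_satisfied([[1, -2], [2, 3], [-1, -3]], 3)
print(steps, assignment)

The last bullet is the whole proof idea, roughly: each resampling consumes k fresh random bits, while the execution log records each resampled clause with fewer than k bits (essentially a neighbor index, since each clause has o(2^k) neighbors); the log plus the final assignment determine all the bits used, so a long run would compress random bits, which is impossible.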


Berry, Gödel, Chaitin, Raz

• There are only finitely many strings of complexity < n
• For all strings x (except finitely many) the statement C(x) > n is true
• Can all true statements of this form be provable?
• No: otherwise we could effectively generate a string of complexity > n by enumerating all proofs
• and get Berry's paradox: for a given n, the first provable statement C(x) > n yields some x of complexity > n that can be described by O(log n) bits (spelled out below)
• (Gödel's theorem in Chaitin's form): there are only finitely many n such that C(x) > n is provable for some x (note that ∃x C(x) > n is always provable!)
• (Gödel's second theorem, Kritchman–Raz proof): the "unexpected test paradox" (a test will be given next week, but it won't be known in advance on which day)
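Written out, the argument in the fourth and fifth bullets is a single displayed contradiction (a sketch, assuming a fixed sound proof system and a fixed enumeration of its proofs; x_n is our name for the witness):

\[
x_n := \text{the first string for which some statement } C(x) > n \text{ appears in the proof enumeration};
\]
\[
C(x_n) \le \log n + O(1) \quad \text{(describe } n \text{ and run the enumerator)}, \qquad C(x_n) > n \quad \text{(what the found proof asserts)},
\]

so n < log n + O(1), which is false for all sufficiently large n; hence only finitely many n can occur in provable statements C(x) > n.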

Page 136: Csr2011 june15 09_30_shen

. . . . . .

Berry, Gödel, Chaitin, RazI There are only finitely many strings of complexity < n

I for all strings (except finitely many ones) x the statementK(x) > n is true

I Can all true statements of this form be provable?I No, otherwise we could effectively generate string ofcomplexity > n by enumerating all proofs

I and get Berry’s paradox: the first provable statementC(x) > n for given n gives some x of complexity > n thatcan be described by O(log n) bits

I (Gödel theorem in Chaitin form): There are only finitelymany n such that C(x) > n is provable for some x. (Notethat ∃xC(x) > n is always provable!)

I (Gödel second theorem, Kritchman–Raz proof): the“unexpected test paradox” (a test will be given next weekbut it won’t be known before the day of the test)

Page 137: Csr2011 june15 09_30_shen

. . . . . .

Berry, Gödel, Chaitin, RazI There are only finitely many strings of complexity < nI for all strings (except finitely many ones) x the statementK(x) > n is true

I Can all true statements of this form be provable?I No, otherwise we could effectively generate string ofcomplexity > n by enumerating all proofs

I and get Berry’s paradox: the first provable statementC(x) > n for given n gives some x of complexity > n thatcan be described by O(log n) bits

I (Gödel theorem in Chaitin form): There are only finitelymany n such that C(x) > n is provable for some x. (Notethat ∃xC(x) > n is always provable!)

I (Gödel second theorem, Kritchman–Raz proof): the“unexpected test paradox” (a test will be given next weekbut it won’t be known before the day of the test)

Page 138: Csr2011 june15 09_30_shen

. . . . . .

Berry, Gödel, Chaitin, RazI There are only finitely many strings of complexity < nI for all strings (except finitely many ones) x the statementK(x) > n is true

I Can all true statements of this form be provable?

I No, otherwise we could effectively generate string ofcomplexity > n by enumerating all proofs

I and get Berry’s paradox: the first provable statementC(x) > n for given n gives some x of complexity > n thatcan be described by O(log n) bits

I (Gödel theorem in Chaitin form): There are only finitelymany n such that C(x) > n is provable for some x. (Notethat ∃xC(x) > n is always provable!)

I (Gödel second theorem, Kritchman–Raz proof): the“unexpected test paradox” (a test will be given next weekbut it won’t be known before the day of the test)

Page 139: Csr2011 june15 09_30_shen

. . . . . .

Berry, Gödel, Chaitin, RazI There are only finitely many strings of complexity < nI for all strings (except finitely many ones) x the statementK(x) > n is true

I Can all true statements of this form be provable?I No, otherwise we could effectively generate string ofcomplexity > n by enumerating all proofs

I and get Berry’s paradox: the first provable statementC(x) > n for given n gives some x of complexity > n thatcan be described by O(log n) bits

I (Gödel theorem in Chaitin form): There are only finitelymany n such that C(x) > n is provable for some x. (Notethat ∃xC(x) > n is always provable!)

I (Gödel second theorem, Kritchman–Raz proof): the“unexpected test paradox” (a test will be given next weekbut it won’t be known before the day of the test)

Page 140: Csr2011 june15 09_30_shen

. . . . . .

Berry, Gödel, Chaitin, RazI There are only finitely many strings of complexity < nI for all strings (except finitely many ones) x the statementK(x) > n is true

I Can all true statements of this form be provable?I No, otherwise we could effectively generate string ofcomplexity > n by enumerating all proofs

I and get Berry’s paradox: the first provable statementC(x) > n for given n gives some x of complexity > n thatcan be described by O(log n) bits

I (Gödel theorem in Chaitin form): There are only finitelymany n such that C(x) > n is provable for some x. (Notethat ∃xC(x) > n is always provable!)

I (Gödel second theorem, Kritchman–Raz proof): the“unexpected test paradox” (a test will be given next weekbut it won’t be known before the day of the test)

Page 141: Csr2011 june15 09_30_shen

. . . . . .

Berry, Gödel, Chaitin, RazI There are only finitely many strings of complexity < nI for all strings (except finitely many ones) x the statementK(x) > n is true

I Can all true statements of this form be provable?I No, otherwise we could effectively generate string ofcomplexity > n by enumerating all proofs

I and get Berry’s paradox: the first provable statementC(x) > n for given n gives some x of complexity > n thatcan be described by O(log n) bits

I (Gödel theorem in Chaitin form): There are only finitelymany n such that C(x) > n is provable for some x.

(Notethat ∃xC(x) > n is always provable!)

I (Gödel second theorem, Kritchman–Raz proof): the“unexpected test paradox” (a test will be given next weekbut it won’t be known before the day of the test)

Page 142: Csr2011 june15 09_30_shen

. . . . . .

Berry, Gödel, Chaitin, RazI There are only finitely many strings of complexity < nI for all strings (except finitely many ones) x the statementK(x) > n is true

I Can all true statements of this form be provable?I No, otherwise we could effectively generate string ofcomplexity > n by enumerating all proofs

I and get Berry’s paradox: the first provable statementC(x) > n for given n gives some x of complexity > n thatcan be described by O(log n) bits

I (Gödel theorem in Chaitin form): There are only finitelymany n such that C(x) > n is provable for some x. (Notethat ∃xC(x) > n is always provable!)

I (Gödel second theorem, Kritchman–Raz proof): the“unexpected test paradox” (a test will be given next weekbut it won’t be known before the day of the test)

Page 143: Csr2011 june15 09_30_shen

. . . . . .

Berry, Gödel, Chaitin, RazI There are only finitely many strings of complexity < nI for all strings (except finitely many ones) x the statementK(x) > n is true

I Can all true statements of this form be provable?I No, otherwise we could effectively generate string ofcomplexity > n by enumerating all proofs

I and get Berry’s paradox: the first provable statementC(x) > n for given n gives some x of complexity > n thatcan be described by O(log n) bits

I (Gödel theorem in Chaitin form): There are only finitelymany n such that C(x) > n is provable for some x. (Notethat ∃xC(x) > n is always provable!)

I (Gödel second theorem, Kritchman–Raz proof): the“unexpected test paradox” (a test will be given next weekbut it won’t be known before the day of the test)

Page 144: Csr2011 june15 09_30_shen

. . . . . .

Hilbert's 13th problem

• A function of ≥ 3 variables: e.g., the solution of a degree-7 polynomial as a function of its coefficients
• Is it possible to represent this function as a composition of functions of at most 2 variables?
• Yes, with weird functions (Cantor)
• Yes, even with continuous functions (Kolmogorov, Arnold)
• Circuit version: find an explicit function B^n × B^n × B^n → B^n (of polynomial circuit size?) that cannot be represented as a composition of O(1) functions B^n × B^n → B^n. Not known.
• Kolmogorov complexity version: we have three strings a, b, c on a blackboard; it is allowed to write (add) a new string if it is simple relative to two of the strings on the board. Which strings can we obtain in O(1) steps? Only strings of small complexity relative to (a, b, c), but not all of them (for random a, b, c); a formalization is sketched below
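One natural way to make the blackboard game precise, for strings a, b, c of length n (a sketch: the O(log n) threshold is our reading of "simple relative to", not fixed by the slide):

\[
\text{move: write } w \text{ with } C(w \mid u, v) = O(\log n) \text{ for some } u, v \text{ already on the board};
\]
\[
\text{question: which } w \text{ with } C(w \mid a, b, c) = O(\log n) \text{ are reachable in } O(1) \text{ moves?}
\]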


Secret sharing

• A secret s and three people: any two of them, working together, can reconstruct the secret, but each one in isolation has no information about it
• Assume s ∈ F (a field); take a random a and tell the three people a, a + s and a + 2s (assuming 2 ≠ 0 in F); a sketch follows this list
• There are other secret-sharing schemes; how long the shares must be is not well understood
• Kolmogorov complexity setting: for a given s, find a, b, c such that
  C(s|a), C(s|b), C(s|c) ≈ C(s);   C(s|a, b), C(s|a, c), C(s|b, c) ≈ 0
• Some relation between the Kolmogorov and traditional settings is known (Romashchenko, Kaced)
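A minimal sketch of the a, a + s, a + 2s scheme over GF(P) for a fixed prime P (so 2 ≠ 0); the function names are ours. Share number k equals a + k·s, so any two shares i ≠ j give s = (s_j − s_i)/(j − i).

import random

P = 2**61 - 1  # a prime, so F = GF(P) is a field and 2 != 0 in it

def share(s, rng=random):
    # shares number 0, 1, 2: the k-th person receives a + k*s (mod P)
    a = rng.randrange(P)
    return [(a + k * s) % P for k in range(3)]

def reconstruct(i, s_i, j, s_j):
    # from shares i != j: s_j - s_i = (j - i) * s, and j - i is invertible
    inverse = pow((j - i) % P, P - 2, P)  # field inverse via Fermat
    return ((s_j - s_i) * inverse) % P

secret = 123456789
shares = share(secret)
assert all(reconstruct(i, shares[i], j, shares[j]) == secret
           for i in range(3) for j in range(3) if i != j)
print(shares, "-> any two recover", secret)

One share alone is uniformly distributed in GF(P) whatever s is, which is exactly the "no information in isolation" clause.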


Quasi-cryptography

• Alice has some information a
• Bob wants to let her know some b
• by sending some message f
• in such a way that Eve learns as little as possible about b
• Formally: for given a, b, find f such that C(b|a, f) ≈ 0 and C(b|f) → max
• Theorem (Muchnik): it is always possible to achieve C(b|f) ≈ min(C(b), C(a)); a one-time-pad example attaining this bound is sketched below
• Full version: Eve knows some c, and we want to send a message of the minimal possible length C(b|a)
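A classical special case where Muchnik's bound is visibly attained (a sketch, not the general theorem: it covers only a, b that are independent random strings of equal length, and f is computed from both a and b, as the theorem allows): take f = a XOR b, a one-time pad. Then b = a XOR f, so C(b|a, f) ≈ 0, while f alone is independent of b, so C(b|f) ≈ |b| ≈ min(C(a), C(b)).

import os

def pad(a: bytes, b: bytes) -> bytes:
    # f = a XOR b: knowing a, one recovers b as a XOR f,
    # while f by itself looks uniformly random
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))

a, b = os.urandom(16), os.urandom(16)
f = pad(a, b)
assert pad(a, f) == b  # Alice decodes b from (a, f)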


Andrej Muchnik (1958–2007)