CS 121: Lecture 3 Representations - Harvard University
Madhu Sudan
https://madhu.seas.Harvard.edu/courses/Fall2020
Book: https://introtcs.org
How to contact us
• The whole staff (faster response): CS 121 Piazza
• Only the course heads (slower): [email protected]
Announcements
• HW0 graded; solutions posted; feedback sent.
• HW1 out; due 1 week from today.
• Section 1 material + video posted.
• New expectations:
  • Watch video in section: Thu+Fri.
  • Must “pre-watch” video for sections: Sat.-Wed.
• Reminder: Boaz Barak on “Compression, Coding and Entropy” – today at 4:30! (Canvas → Zoom → CS 121.5)
Today
• Main message: Can represent “everything” with $\{0,1\}^*$
• Will define “represent” and explain “everything”
• Lesson 1: Can represent $\mathbb{N}$ with $\{0,1\}^*$
• Break 1
• Lesson 2: Representing $\mathbb{N} \times \mathbb{N}$ with $\{0,1\}^*$
• Break 2
• Lesson 3: Prefix-free representations. Representing $\mathbb{N}^*$
• End: Can’t represent $\mathbb{R}$
Representations: Motivation
• Computers manipulate data. But what is data?
• Can all data be expressed as bits?
• Answer 1: Yes – don’t we already do it?
• Answer 2: Obviously! There are so many ways to do it! Pick your favorite.
• Today’s Central Player: $\{0,1\}^* = \bigcup_{n \in \mathbb{N}} \{0,1\}^n$
• All finite length binary strings
Definition: A representation scheme for a set of objects $\mathcal{O}$ is a one-to-one function $E : \mathcal{O} \to \{0,1\}^*$.
Equivalently: There is $D : \{0,1\}^* \to \mathcal{O}$ s.t. $D(E(x)) = x$ for every $x \in \mathcal{O}$.
[Figure: $E$ (encoding) maps each $x \in \mathcal{O}$ to a string; $D$ (decoding) maps $E(x)$ back to $x$.]
Much research on “good” representations:
• Effectiveness: Can compute encoding and decoding
• Compression: Representation with small size (e.g., JPEG)
• Error correction: Representation that is robust to errors (e.g., “control digits”, error correcting codes)
• Data structures: Representation enabling fast operations (e.g., binary numbers, distance oracles)
• Feature extraction: Representation enabling prediction (e.g., deep nets)
• Secrecy: Representation hiding certain information (e.g., encryption)
Today: Simple representations for standard objects.
Binary representation
One-to-one function $E : \mathbb{N} \to \{0,1\}^*$
Place values: 64 32 16 8 4 2 1
Bits of 83:    1  0  1 0 0 1 1
$E(83) = 1010011$
Place values: 16 8 4 2 1
Bits of 17:    1 0 0 0 1
$E(17) = 10001$
Decoding: $D(110) = 6$; should $D(0110)$ also be 6?
Length: $|E(x)| = O(\log x)$
“I always work in base 10.”
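The binary representation above can be sketched in a few lines of Python. The function names `encode_nat` and `decode_nat` are illustrative, and the convention that 0 is encoded as the string "0" is an assumption not stated on the slide.

```python
def encode_nat(n: int) -> str:
    """Binary representation E(n): most significant bit first, no leading zeros."""
    if n < 0:
        raise ValueError("n must be a natural number")
    if n == 0:
        return "0"  # assumed convention for zero
    bits = []
    while n > 0:
        bits.append(str(n % 2))  # peel off the least significant bit
        n //= 2
    return "".join(reversed(bits))


def decode_nat(s: str) -> int:
    """Decoder D: read bits left to right, doubling as we go."""
    value = 0
    for ch in s:
        if ch not in "01":
            raise ValueError("not a binary string")
        value = 2 * value + int(ch)
    return value
```

Note that this decoder maps both `"110"` and `"0110"` to 6. That is fine for a decoder: only `E` has to be one-to-one, and `D` merely has to satisfy `D(E(x)) = x`.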
Exercise 1:
• Give a representation of $[3]^* = \{0,1,2\}^*$ as binary strings.
• Specifically give the encoding function.
• “Prove” it is one-to-one (or give decoder).
• Give an upper bound on the ratio $|E(x)|/|x|$ of your representation (in the limit $|x| \to \infty$).
• Now improve the ratio!
Exercise 1: Solutions
0. The question is asking for a one-to-one function $E : \mathcal{O} \to \{0,1\}^*$, where $\mathcal{O} = \{0,1,2\}^*$.
1. Our simple solution is:
(a) Let $e : \{0,1,2\} \to \{0,1\}^2$ be the map $e(0) = 00$, $e(1) = 11$, $e(2) = 10$.
Then if $d : \{0,1\}^2 \to \{0,1,2\}$ is the function $d(00) = 0$, $d(11) = 1$, $d(10) = 2$, $d(01) = 0$, then $\forall x \in \{0,1,2\}$ we have $d(e(x)) = x$. So $e$ is 1-1.
(b) Let $E : \{0,1,2\}^* \to \{0,1\}^*$ be given by $E(x_0 \dots x_{n-1}) = e(x_0)\, e(x_1) \dots e(x_{n-1})$.
Let $D(y_0 \dots y_{m-1}) = d(y_0 y_1)\, d(y_2 y_3) \dots d(y_{m-2} y_{m-1})$ if $m$ is even, and $D(y_0 \dots y_{m-1}) = 0$ if $m$ is odd.
Then $\forall x \in \{0,1,2\}^*$ we have $D(E(x)) = x$. (Verify this!)
(c) This solution achieves $|E(x)|/|x| = 2$ $\forall x$.
2. A more “length-efficient” solution would go as follows:
$\{0,1,2\}^* \xrightarrow{\rho} \{0,1,2\}^* \xrightarrow{F_1} \mathbb{N} \xrightarrow{F_2} \{0,1\}^*$
where
• $\rho(x_0 \dots x_{n-1}) = x_0 \dots x_{n-1} 1$ (append 1 to the string).
• $F_1(y_0 \dots y_{m-1}) = \sum_{i=0}^{m-1} y_i 3^i$.
• $F_2(A) = (A \bmod 2)\, F_2(\lfloor A/2 \rfloor)$ is the binary representation of $A$.
To see that the composition $F = F_2 \circ F_1 \circ \rho$ is one-to-one:
• Verify that $F_1 \circ \rho$ is one-to-one (even though $F_1$ is not!).
• Recall (from earlier in the lecture) that $F_2$ is one-to-one.
• Composition preserves one-to-one-ness.
3. To analyze the performance of $F$ we note
(i) $F_1 \circ \rho(x_0 \dots x_{n-1}) \le 3^{n+2}$
(ii) $|F_2(A)| \le \log_2 A + 1$
$\Rightarrow |F(x_0 \dots x_{n-1})| \le \log_2 3^{n+2} + 1 = (n+2) \log_2 3 + 1$
$\Rightarrow \lim_{n \to \infty} |F(x_0 \dots x_{n-1})| / n = \log_2 3$.
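Both solutions to Exercise 1 can be tried out with a short Python sketch. Ternary strings are modeled as Python strings over "012". The function names are illustrative, and the bit order of the binary step (least significant bit first, matching the recursive definition of the binary map) is an assumption.

```python
E_TABLE = {"0": "00", "1": "11", "2": "10"}
D_TABLE = {"00": "0", "11": "1", "10": "2", "01": "0"}


def encode_pairs(x: str) -> str:
    """Simple solution: two bits per ternary symbol."""
    return "".join(E_TABLE[c] for c in x)


def decode_pairs(y: str) -> str:
    """Decoder for the simple solution; odd-length inputs decode to '0'."""
    if len(y) % 2 == 1:
        return "0"
    return "".join(D_TABLE[y[i:i + 2]] for i in range(0, len(y), 2))


def encode_base3(x: str) -> str:
    """Length-efficient solution: append 1, read as base 3, write in binary."""
    digits = x + "1"                                      # rho
    a = sum(int(d) * 3 ** i for i, d in enumerate(digits))  # base-3 value
    bits = ""
    while a > 0:                                          # (A mod 2) first
        bits += str(a % 2)
        a //= 2
    return bits


def decode_base3(y: str) -> str:
    """Invert encode_base3: binary to number, number to base 3, drop the 1."""
    a = sum(int(b) * 2 ** i for i, b in enumerate(y))
    digits = ""
    while a > 0:
        digits += str(a % 3)
        a //= 3
    return digits[:-1]  # remove the appended 1
```

For long inputs the second encoder uses roughly `log2(3) ≈ 1.585` bits per symbol instead of 2; for example, `len(encode_base3("2" * 1000)) / 1000` stays below 1.6.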
Representing rational numbers
Lemma: There is a one-to-one function $E : \{0,1\}^* \times \{0,1\}^* \to \{0,1\}^*$.
$E$ encodes a pair of strings $(x, x')$ as a single string $y$.
What about $E(x, x') = xx'$?
(You) Will prove lemma shortly!
Corollary: Can represent $\mathbb{N} \times \mathbb{N}$ as strings.
Corollary: Can represent rational numbers as strings.
Proof of Corollary: $\mathbb{N} \times \mathbb{N} \to \{0,1\}^* \times \{0,1\}^* \to \{0,1\}^*$: encode each number in binary, then apply the Lemma.
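One way to explore the question about plain concatenation is to look for two different pairs with the same image. The following small sketch (the names `concat`, `pair_a`, and `pair_b` are illustrative) exhibits such a collision, which shows that concatenation alone is not one-to-one on pairs.

```python
def concat(x: str, y: str) -> str:
    """Candidate pair encoding: just glue the two strings together."""
    return x + y


# Two different pairs of binary strings...
pair_a = ("0", "10")
pair_b = ("01", "0")

# ...that concatenate to the same string "010".
same_image = concat(*pair_a) == concat(*pair_b)
```

The decoder cannot tell where the first string ends, which is the issue prefix-free encodings are designed to fix.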
Prefix freeness
Definition: $E : \mathcal{O} \to \{0,1\}^*$ is prefix free if for every $x \ne x'$, $E(x)$ is not a prefix of $E(x')$.
Example: 101 is a prefix of 1010011.
English is not prefix free: teacherstalking.com
Theorem (2.18): “If $E$ is prefix free then we can use it to encode pairs/lists.”
$E$ prefix free $\Rightarrow$ $E' : \mathcal{O}^* \to \{0,1\}^*$ defined as $E'(x_0, x_1, \dots, x_{k-1}) = E(x_0)\, E(x_1) \dots E(x_{k-1})$ is one-to-one.
Lemma (2.20): “Every encoding can be converted into a prefix-free one.”
If $E : \mathcal{O} \to \{0,1\}^*$ is one-to-one, then there exists a prefix-free one-to-one $E' : \mathcal{O} \to \{0,1\}^*$.
• numbers → lists of numbers
• lists of lists of numbers = matrices
• images
• lists of lists of images = videos
• …
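As an illustration of the theorem and lemma above, here is a minimal sketch of one common way to make a binary encoding prefix-free (double every bit, then end with the marker 01) and to encode a list by concatenation. This particular construction and the names `to_prefix_free`, `encode_list`, and `decode_list` are assumptions for illustration; the slide does not fix a construction.

```python
def to_prefix_free(s: str) -> str:
    """Map 0 -> 00 and 1 -> 11, then append the end marker 01."""
    return "".join(b + b for b in s) + "01"


def encode_list(items: list) -> str:
    """Concatenate the prefix-free encodings of the items."""
    return "".join(to_prefix_free(s) for s in items)


def decode_list(y: str) -> list:
    """Scan two bits at a time; 01 closes the current item."""
    items, current = [], []
    for i in range(0, len(y), 2):
        pair = y[i:i + 2]
        if pair == "01":
            items.append("".join(current))
            current = []
        elif pair in ("00", "11"):
            current.append(pair[0])
        else:
            raise ValueError("not a valid encoding")
    if current:
        raise ValueError("truncated encoding")
    return items
```

Because no codeword is a prefix of another, the decoder always knows where one item ends and the next begins, even when some items are empty strings.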
Exercise 2:
• Give a prefix free mapping $E : \{0,1\}^* \to \{0,1\}^*$.
• Hint (example): $0010100 \mapsto 0010100\#$
• Give an upper bound on the ratio $|E(x)|/|x|$ for your $E$.
• (If you have extra time, try to think of $E$’s that improve the ratio.)
Solution
WARNING: I made some incorrect claims in lecture. This is the corrected, sanitized version.
The idea for our basic encoding $E$ is simple:
$\{0,1\}^* \xrightarrow{E_1} \{0,1,2\}^* \xrightarrow{E_2} \{0,1\}^*$
where $E_1 : \{0,1\}^* \to \{0,1,2\}^*$ is given by $E_1(x_0 \dots x_{n-1}) = x_0 \dots x_{n-1} 2$
and $E_2 : \{0,1,2\}^* \to \{0,1\}^*$ is given by $E_2(y_0 \dots y_{m-1}) = e_2(y_0)\, e_2(y_1) \dots e_2(y_{m-1})$,
where $e_2 : \{0,1,2\} \to \{0,1\}^2$ is given by $e_2(0) = 00$, $e_2(1) = 11$, $e_2(2) = 10$.
To prove correctness we make the following claims:
(1) $E_1$ is prefix-free. (Verify yourselves.)
(2) $e_2$ is prefix-free. (Verify yourselves.)
(3) If $E = E_2 \circ E_1$, where $E_2(y_0 \dots y_{m-1}) = e_2(y_0) \dots e_2(y_{m-1})$ and $E_1$, $e_2$ are prefix-free, then so is $E$.
Notes:
(1) The transformation $e_2 \to E_2$ is exactly the one from Theorem 2.18 in the book (mentioned earlier in lecture). Thm 2.18 says $E_2$ is one-to-one.
(2) I claimed “$E_1$ prefix-free + $E_2$ one-to-one implies $E$ is prefix-free”. This is NOT CORRECT.
Example due to Adam H.:
• Let $F : \{0,1,2\}^* \to \{0,1\}^*$ be the map from solution 2 to Exercise 1.
• Using $E_2 = F$ and $E_1$ as in this solution, we get $E(\text{empty string}) = 101$ and $E(00) = 101101$, so $E(\text{empty string})$ is a prefix of $E(00)$.
Note that (1) + (2) + (3) $\Rightarrow$ $E$ is prefix-free.
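The failure of “$E_1$ prefix-free plus $E_2$ one-to-one” can be checked numerically. The sketch below assumes that the map from solution 2 of Exercise 1 appends the digit 1, reads the digits as a base-3 number with the first digit least significant, and writes the result in binary; the names `F` and `E` mirror the notes, but the code itself is only an illustration.

```python
def F(y: str) -> str:
    """Base-3 map from Exercise 1, solution 2 (assumed digit order: least significant first)."""
    digits = y + "1"                                        # append 1
    a = sum(int(d) * 3 ** i for i, d in enumerate(digits))  # base-3 value
    bits = ""
    while a > 0:                                            # binary, low bit first
        bits += str(a % 2)
        a //= 2
    return bits


def E(x: str) -> str:
    """Append the terminator 2 (the map E_1), then apply F in place of E_2."""
    return F(x + "2")


empty_code = E("")     # base-3 digits "21"   -> 2 + 3 = 5
double_zero = E("00")  # base-3 digits "0021" -> 18 + 27 = 45
is_prefix = double_zero.startswith(empty_code)
```

Under these assumptions `E("")` is a prefix of `E("00")`, so the composition is one-to-one but not prefix-free.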
Proof of (3):
• Suppose, for contradiction, that $x \ne z$ with $E(x) = y_0 \dots y_{n-1}$ and $E(z) = y_0 \dots y_{n-1} y_n \dots y_{m-1}$, so $E(x)$ is a prefix of $E(z)$.
• Let $d_2$ be the decoder corresponding to $e_2$ and $D_2$ the left-to-right decoder for $E_2$ based on $d_2$. [So $D_2(y_0 \dots y_{n-1}) = d_2(y_0 \dots y_i)\, d_2(y_{i+1} \dots y_j) \dots$]
Then $D_2(y_0 \dots y_{n-1}) = w_0 \dots w_{a-1}$ and $D_2(y_0 \dots y_{m-1}) = v_0 \dots v_{b-1}$ satisfy $w_0 = v_0$, $w_1 = v_1$, …, $w_{a-1} = v_{a-1}$.
• In other words, $w_0 \dots w_{a-1}$ is a prefix of $v_0 \dots v_{b-1}$.
But $w_0 \dots w_{a-1} = E_1(x)$ and $v_0 \dots v_{b-1} = E_1(z)$, and this implies $E_1(x)$ is a prefix of $E_1(z)$, contradicting “$E_1$ is prefix-free”.
Performance of our solution: $|E(x)|/|x| = ?$
If $|x| = n$, then $|E_1(x)| = n + 1$ and $|E_2(E_1(x))| = 2(n+1)$
$\Rightarrow \lim_{n \to \infty} |E(x)|/|x| = \lim_{n \to \infty} 2(n+1)/n = 2$.
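Improved solution:
Claim: If $E : \{0,1\}^* \to \{0,1\}^*$ is the prefix-free solution from earlier and $E_0 : \mathbb{N} \to \{0,1\}^*$ (the binary representation) is one-to-one, then
(i) $\bar{E}(x) = E(E_0(|x|))\, x$ is prefix-free, and
(ii) $\lim_{|x| \to \infty} |\bar{E}(x)|/|x| = 1$.
Proof. (ii) is easy, so let's do that first.
If $|x| = n$, then $|E_0(|x|)| \le \log_2 n + 1$
$\Rightarrow |E(E_0(|x|))| \le 2(\log_2 n + 2)$ (since $|E(w)| = 2(|w| + 1)$ for the basic solution)
$\Rightarrow |\bar{E}(x)| \le n + 2\log_2 n + 4$
$\Rightarrow \lim_{n \to \infty} (n + 2\log_2 n + 4)/n = 1$.
(i) Assume again for contradiction that $\bar{E}(x)$ is a prefix of $\bar{E}(z)$ for some $x \ne z$. Then since $E(E_0(|x|))$ is a prefix of $\bar{E}(x)$ and $E(E_0(|z|))$ is a prefix of $\bar{E}(z)$, we have
(1) $E(E_0(|x|))$ is a proper prefix of $E(E_0(|z|))$, or
(2) $E(E_0(|x|)) = E(E_0(|z|))$, or
(3) $E(E_0(|z|))$ is a proper prefix of $E(E_0(|x|))$.
But (1) and (3) can't happen since $E$ is prefix-free.
And if (2) happens then $|x| = |z|$ (since $E$ and $E_0$ are one-to-one), and $x$ is a prefix of $z$ $\Rightarrow x = z$, a contradiction.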
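The basic and improved encodings for Exercise 2 can be sketched as follows. The names `basic`, `improved`, and `decode_improved` are illustrative, and the choice to write the length in standard binary (with 0 written as "0") is an assumption.

```python
PAIR = {"0": "00", "1": "11", "2": "10"}


def basic(x: str) -> str:
    """Basic solution: append the terminator 2, then use two bits per symbol."""
    return "".join(PAIR[c] for c in x + "2")


def improved(x: str) -> str:
    """Improved solution: prefix-free encoding of |x| in binary, followed by x itself."""
    return basic(format(len(x), "b")) + x


def decode_improved(y: str) -> str:
    """Read bit pairs until the terminator 10 to recover |x|, then read |x| bits."""
    length_bits = ""
    i = 0
    while y[i:i + 2] != "10":
        length_bits += y[i]  # pairs 00 / 11 carry one bit each
        i += 2
    n = int(length_bits, 2)
    return y[i + 2:i + 2 + n]
```

For a 1000-bit input, `basic` produces 2002 bits while `improved` produces 1022 bits, which illustrates the ratio dropping from 2 toward 1.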
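E.g., Representing Graphs
• $G = (V, E)$; $V = [n]$; $E \subseteq V \times V$
• Let $E_{\mathbb{N}} : \mathbb{N} \to \{0,1\}^*$ be prefix free.
• Let $E \subseteq V \times V$ be $\{(i_1, j_1), \dots, (i_m, j_m)\}$ and $V = [n]$.
• Then $E_G : \text{Graphs} \to \{0,1\}^*$ is given by
$E_G(G) = E_{\mathbb{N}}(n)\, E_{\mathbb{N}}(i_1)\, E_{\mathbb{N}}(j_1) \dots E_{\mathbb{N}}(i_m)\, E_{\mathbb{N}}(j_m)$.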
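A minimal sketch of this graph encoding, assuming vertices are numbered from 0 to n-1 and using a bit-doubling prefix-free code for natural numbers (the helper `enc_nat` and the other names are illustrative choices, not the lecture's):

```python
def enc_nat(n: int) -> str:
    """Prefix-free code for a natural number: double each binary digit, end with 01."""
    return "".join(b + b for b in format(n, "b")) + "01"


def encode_graph(n: int, edges: list) -> str:
    """Encode the vertex count, then both endpoints of every edge in order."""
    out = enc_nat(n)
    for i, j in edges:
        out += enc_nat(i) + enc_nat(j)
    return out


def decode_graph(y: str):
    """Recover (n, edges) by splitting the string at each 01 marker."""
    nums, bits = [], ""
    for k in range(0, len(y), 2):
        pair = y[k:k + 2]
        if pair == "01":
            nums.append(int(bits, 2))
            bits = ""
        else:
            bits += pair[0]
    n, rest = nums[0], nums[1:]
    edges = list(zip(rest[0::2], rest[1::2]))  # consecutive numbers form an edge
    return n, edges
```

For example, `decode_graph(encode_graph(4, [(0, 1), (1, 2), (3, 0)]))` recovers the original vertex count and edge list.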
Example: Representing Matrix
• $M \in \mathbb{Z}^{m \times n}$:
• First represent “list”
• Then represent “list of lists”
Let $E : \mathbb{Z} \to \{0,1\}^*$ be prefix-free.
$E_{\text{list}}((i_1, \dots, i_n)) = E(n)\, E(i_1) \dots E(i_n)$ is prefix-free!
$E_{\text{mat}}(((a_{11}, \dots, a_{1n}), \dots, (a_{m1}, \dots, a_{mn}))) = E(m)\, E_{\text{list}}(a_{11}, \dots, a_{1n})\, E_{\text{list}}(a_{21}, \dots, a_{2n}) \dots E_{\text{list}}(a_{m1}, \dots, a_{mn})$ is also prefix-free.
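The list and list-of-lists encodings can be sketched in Python as below. The integer code `enc_int` (a sign bit followed by the binary magnitude, with every bit doubled and a 01 end marker) is an assumed choice of prefix-free $E$, and all function names are illustrative.

```python
def enc_int(a: int) -> str:
    """Prefix-free code for an integer: sign bit + binary |a|, bits doubled, then 01."""
    raw = ("1" if a < 0 else "0") + format(abs(a), "b")
    return "".join(b + b for b in raw) + "01"


def enc_list(row) -> str:
    """Length of the list first, then each entry."""
    return enc_int(len(row)) + "".join(enc_int(a) for a in row)


def enc_matrix(rows) -> str:
    """Number of rows first, then each row as an encoded list."""
    return enc_int(len(rows)) + "".join(enc_list(r) for r in rows)


def dec_ints(y: str) -> list:
    """Split at each 01 marker and turn every chunk back into an integer."""
    nums, raw = [], ""
    for k in range(0, len(y), 2):
        pair = y[k:k + 2]
        if pair == "01":
            sign = -1 if raw[0] == "1" else 1
            nums.append(sign * int(raw[1:], 2))
            raw = ""
        else:
            raw += pair[0]
    return nums


def dec_matrix(y: str) -> list:
    """Use the stored counts to regroup the flat integer stream into rows."""
    nums = dec_ints(y)
    m, pos, rows = nums[0], 1, []
    for _ in range(m):
        n = nums[pos]
        rows.append(nums[pos + 1:pos + 1 + n])
        pos += 1 + n
    return rows
```

Storing the length before the entries is what lets the decoder tell where one row stops and the next starts.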
Prefix-free encoding in practice
• “C style strings”: null terminated
• “Pascal style”: encode $x \in \{0,1\}^{\le 255}$ as $(|x|, x)$
...both led to many security breaks
TLS Heartbeat protocol
Check connection is alive:
Request: $|x|, x$
Response: $|x|, x$
Heartbleed attack
“Some might argue that it is the worst vulnerability found (at least in terms of its potential impact) since commercial traffic began to flow on the Internet.” Joseph Steinberg, Forbes.
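The danger of trusting a length field in a Pascal-style (length, payload) message can be illustrated with a toy simulation. This is not the real TLS heartbeat message layout or any library's API; the one-byte length, the `memory` argument standing in for adjacent data, and all function names are assumptions made for illustration.

```python
def frame(payload: bytes) -> bytes:
    """Pascal-style framing: one length byte followed by the payload (max 255 bytes)."""
    if len(payload) > 255:
        raise ValueError("payload too long for a one-byte length")
    return bytes([len(payload)]) + payload


def echo_unsafe(message: bytes, memory: bytes) -> bytes:
    """Trusts the claimed length, so it may return bytes that were never sent."""
    claimed = message[0]
    buffer = message[1:] + memory  # simulate the payload sitting next to other data
    return buffer[:claimed]


def echo_safe(message: bytes) -> bytes:
    """Checks the claimed length against the bytes actually received."""
    claimed = message[0]
    payload = message[1:]
    if claimed != len(payload):
        raise ValueError("claimed length does not match payload")
    return payload
```

With a message that claims 10 bytes but carries only `b"hi"`, the unsafe echo returns `b"hi"` followed by 8 bytes of whatever was stored next to it, while the safe echo rejects the message.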
Can we represent everything?
Unfortunate Fact (Thm 2.5): “Can’t represent real numbers as strings”
There is no one-to-one function $RtS : \mathbb{R} \to \{0,1\}^*$.
Implications
Thesis: Everything representable as $\{0,1\}^*$!
Everything = $\mathbb{Z}^*$
Or Everything = $\{0,1\}^*$
Or Everything = $\text{Keyboard}^*$
Includes Music? All sounds? All images?
All Smell? People (“Beam me up, Scotty!”)
Rest of the course:
Part I: Circuits: Finite computation, quantitative study
Part II: Automata: Infinite restricted computation, quantitative study
Part III: Turing Machines: Infinite computation, qualitative study
Part IV: Efficient Computation: Infinite computation, quantitative study
Part V: Randomized computation: Extending studies to non-classical algorithms