CS 121: Lecture 3 Representations - Harvard University

Post on 01-Jun-2022


CS 121: Lecture 3: Representations

Madhu Sudan

https://madhu.seas.Harvard.edu/courses/Fall2020

Book: https://introtcs.org

How to contact us
• The whole staff (faster response): CS 121 Piazza
• Only the course heads (slower): cs121.fall2020.course.heads@gmail.com

Announcements
• HW0 graded; solutions posted; feedback sent.
• HW1 out; due 1 week from today.
• Section 1 material + video posted.
• New expectations:
  • Watch video in section: Thu+Fri.
  • Must “pre-watch” video for sections: Sat.-Wed.
• Reminder: Boaz Barak on “Compression, Coding and Entropy”, today at 4:30! (Canvas → Zoom → CS 121.5)

Today
• Main message: Can represent “everything” with $\{0,1\}^*$
  • Will define “represent” and explain “everything”
• Lesson 1: Can represent $\mathbb{N}$ with $\{0,1\}^*$
• Break 1
• Lesson 2: Representing $\mathbb{N} \times \mathbb{N}$ with $\{0,1\}^*$
• Break 2
• Lesson 3: Prefix-free representations. Representing $\mathbb{N}^*$
• End: Can’t represent $\mathbb{R}$

Representations: Motivation
• Computers manipulate data. But what is data?
• Can all data be expressed as bits?
  • Answer 1: Yes. Don’t we already do it?
  • Answer 2: Obviously! There are so many ways to do it! Pick your favorite.
• Today’s central player: $\{0,1\}^* = \bigcup_{n \in \mathbb{N}} \{0,1\}^n$
  • All finite-length binary strings


Definition: A representation scheme for a set of objects $\mathcal{O}$ is a one-to-one function $E : \mathcal{O} \to \{0,1\}^*$.

Equivalently: There is $D : \{0,1\}^* \to \mathcal{O}$ s.t. $D(E(x)) = x$ for every $x \in \mathcal{O}$.

Terminology: $E$ is the encoding and $D$ is the decoding, so $x \mapsto E(x) \mapsto D(E(x)) = x$.

Much research on “good” representations:
• Effectiveness: Can compute encoding and decoding
• Compression: Representation with small size (e.g., JPEG)
• Error correction: Representation that is robust to errors (e.g., “control digits”, error correcting codes)
• Data structures: Representation enabling fast operations (e.g., binary numbers, distance oracles)
• Feature extraction: Representation enabling prediction (e.g., deep nets)
• Secrecy: Representation hiding certain information (e.g., encryption)

Today: Simple representations for standard objects.


Binary representation
One-to-one function $E : \mathbb{N} \to \{0,1\}^*$

Notes on the decoder and on length:
• $D(110) = 6$. What about $D(0110)$? It can also be $6$: the decoder $D$ need not be one-to-one.
• $|E(x)| = O(\log x)$

Worked examples, with the place value written above each bit:

64 32 16  8  4  2  1
 1  0  1  0  0  1  1
$E(83) = 1010011$

16  8  4  2  1
 1  0  0  0  1
$E(17) = 10001$

“I always work in base 10.”
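The binary representation and a matching decoder can be sketched in a few lines of Python. This is only an illustration: the function names `encode_nat` and `decode_nat` are made up here, and the choices that 0 is encoded as "0" and that the decoder ignores leading zeros are assumptions, not something the slides fix.

```python
def encode_nat(n: int) -> str:
    """Binary representation E(n) of a natural number, most significant bit first."""
    if n < 0:
        raise ValueError("only natural numbers are encoded here")
    return format(n, "b")  # assumption: 0 is encoded as "0"


def decode_nat(bits: str) -> int:
    """A decoder D defined on every binary string; leading zeros are ignored."""
    return int(bits, 2) if bits else 0  # assumption: the empty string decodes to 0


print(encode_nat(83), encode_nat(17))
print(decode_nat("110"), decode_nat("0110"))
```

The decoder maps both "110" and "0110" to 6, which is fine: only $D(E(x)) = x$ is required.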

Exercise 1:
• Give a representation of $[3]^* = \{0,1,2\}^*$ as binary strings.
  • Specifically, give the encoding function.
  • “Prove” it is one-to-one (or give a decoder).
• Give an upper bound on the ratio $|E(x)|/|x|$ of your representation.*
• Now improve the ratio!

*In the limit $|x| \to \infty$.

Exercise 1: Solutions

0. The question is asking for a one-to-one function $E : \mathcal{O} \to \{0,1\}^*$ where $\mathcal{O} = \{0,1,2\}^*$.

1. Our simple solution is:

(a) Let $e : \{0,1,2\} \to \{0,1\}^2$ be the map
    $e(0) = 00$; $e(1) = 11$; $e(2) = 10$.
  If $d : \{0,1\}^2 \to \{0,1,2\}$ is the function
    $d(00) = 0$, $d(11) = 1$, $d(10) = 2$, $d(01) = 0$,
  then $\forall x \in \{0,1,2\}$ we have $d(e(x)) = x$. So $e$ is 1-1.

(b) Let $E : \{0,1,2\}^* \to \{0,1\}^*$ be given by
    $E(x_0 \ldots x_{n-1}) = e(x_0)\, e(x_1) \cdots e(x_{n-1})$.
  Let
    $D(y_0 \ldots y_{m-1}) = d(y_0 y_1)\, d(y_2 y_3) \cdots d(y_{m-2} y_{m-1})$ if $m$ is even,
    $D(y_0 \ldots y_{m-1}) = 0$ if $m$ is odd.
  Then $\forall x \in \{0,1,2\}^*$ we have $D(E(x)) = x$. (Verify this!)

(c) This solution achieves $|E(x)| / |x| = 2$ $\forall x$.

2. A more “length-efficient” solution would go as follows:

$\{0,1,2\}^* \xrightarrow{P} \{0,1,2\}^* \xrightarrow{F_1} \mathbb{N} \xrightarrow{E} \{0,1\}^*$

where
• $P(x_0 \ldots x_{n-1}) = x_0 \ldots x_{n-1} 1$ (append 1 to the string),
• $F_1(y_0 \ldots y_{n-1}) = \sum_{i=0}^{n-1} y_i 3^i$,
• $E(A) = (A \bmod 2)\, E(\lfloor A/2 \rfloor)$ is the binary representation of $A$.

To see that the composition $F = E \circ F_1 \circ P$ is one-to-one:
• Verify that $F_1 \circ P$ is one-to-one (even though $F_1$ is not!).
• Recall (from earlier in the lecture) that $E$ is one-to-one.
• Composition preserves one-to-one-ness.

3. To analyze the performance of $F$ we note
(i) $F_1 \circ P(x_0 \ldots x_{n-1}) \le 3^{n+2}$
(ii) $|E(A)| \le \log_2 A + 1$

$\Rightarrow |F(x_0 \ldots x_{n-1})| \le \log_2 3^{n+2} + 1 = (n+2) \log_2 3 + 1$

$\Rightarrow \lim_{n \to \infty} |F(x_0 \ldots x_{n-1})| / n = \log_2 3$.
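Both solutions to Exercise 1 can be tried out with the following Python sketch. The function names are invented for illustration, strings over $\{0,1,2\}$ are modelled as Python `str`, and the binary step uses Python's most-significant-bit-first `format(value, "b")`, which is an assumption about bit order (any fixed order gives a one-to-one map).

```python
import math

E_SYMBOL = {"0": "00", "1": "11", "2": "10"}              # the map e
D_SYMBOL = {"00": "0", "11": "1", "10": "2", "01": "0"}   # the map d


def encode_simple(x: str) -> str:
    """Solution 1: two bits per ternary symbol."""
    return "".join(E_SYMBOL[c] for c in x)


def decode_simple(y: str) -> str:
    """Decoder D: pair by pair for even length, the fixed value "0" for odd length."""
    if len(y) % 2 == 1:
        return "0"
    return "".join(D_SYMBOL[y[i:i + 2]] for i in range(0, len(y), 2))


def encode_compact(x: str) -> str:
    """Solution 2: append 1, read as a base-3 number, write that number in binary."""
    padded = x + "1"                                           # P
    value = sum(int(c) * 3**i for i, c in enumerate(padded))   # F_1
    return format(value, "b")                                  # binary step


def decode_compact(y: str) -> str:
    """Invert encode_compact: recover base-3 digits, then drop the appended 1."""
    value = int(y, 2)
    digits = []
    while value > 0:
        value, digit = divmod(value, 3)
        digits.append(str(digit))
    return "".join(digits[:-1])


x = "2010212" * 20
print(len(encode_simple(x)) / len(x))                  # ratio of solution 1
print(len(encode_compact(x)) / len(x), math.log2(3))   # ratio of solution 2 vs log2(3)
```

For long inputs the second ratio approaches $\log_2 3 \approx 1.585$, compared with exactly 2 for the first.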

Representing rational numbers

Lemma: There is a one-to-one function $E : \{0,1\}^* \times \{0,1\}^* \to \{0,1\}^*$.

$E$ encodes a pair of strings $(x, x')$ as a single string $y = E(x, x')$.

What about $E(x, x') = x x'$?

(You) will prove the lemma shortly!

Corollary: Can represent $\mathbb{N} \times \mathbb{N}$ as strings.

Corollary: Can represent rational numbers as strings.

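Proof of Corollary: Let $E_0 : \mathbb{N} \to \{0,1\}^*$ be the binary representation and let $E$ be the map from the Lemma. Encode each coordinate with $E_0$, then combine the two strings with $E$:

$\mathbb{N} \times \mathbb{N} \xrightarrow{E_0 \times E_0} \{0,1\}^* \times \{0,1\}^* \xrightarrow{E} \{0,1\}^*$

A composition of one-to-one functions is one-to-one.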
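To see why plain concatenation $E(x, x') = x x'$ does not settle the Lemma, here is a tiny Python check; the name `concat_pair` is made up for this illustration.

```python
def concat_pair(x: str, y: str) -> str:
    """Candidate pair encoding: just concatenate the two strings."""
    return x + y


# Two different pairs collide, so this map is not one-to-one.
first = ("0", "1")
second = ("01", "")
print(first != second, concat_pair(*first) == concat_pair(*second))
```

The boundary between the two strings is lost, which is exactly what prefix-free encodings repair.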

Prefix freeness
Definition: $E : \mathcal{O} \to \{0,1\}^*$ is prefix free if for every $x \ne x'$, $E(x)$ is not a prefix of $E(x')$.

Example: 101 is a prefix of 1010011.

English is not prefix free: teacherstalking.com

Theorem (2.18): “If $E$ is prefix free then we can use it to encode pairs/lists.”
$E$ prefix free $\Rightarrow$ $E' : \mathcal{O}^* \to \{0,1\}^*$ defined as $E'(x_0, x_1, \ldots, x_{k-1}) = E(x_0)\, E(x_1) \cdots E(x_{k-1})$ is one-to-one.

Lemma (2.20): “Every encoding can be converted into a prefix-free one.”
If $E : \mathcal{O} \to \{0,1\}^*$ is one-to-one, then there exists a prefix-free one-to-one $E' : \mathcal{O} \to \{0,1\}^*$.

• numbers
• lists of numbers
• lists of lists of numbers = matrices
• images
• lists of lists of images = videos
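The idea behind Theorem 2.18 is that a concatenation of prefix-free codewords can be split unambiguously by reading left to right. A minimal Python sketch, assuming a finite object set given as a dictionary (the example code `CODE` and all function names are hypothetical):

```python
def is_prefix_free(code: dict) -> bool:
    """True if no codeword is a prefix of the codeword of a different object."""
    words = list(code.values())
    return not any(
        i != j and words[j].startswith(words[i])
        for i in range(len(words))
        for j in range(len(words))
    )


def encode_list(code: dict, items: list) -> str:
    """E'(x_0, ..., x_{k-1}) = E(x_0) E(x_1) ... E(x_{k-1})."""
    return "".join(code[item] for item in items)


def decode_list(code: dict, bits: str) -> list:
    """Left-to-right decoder: emit an object as soon as a codeword is complete."""
    inverse = {word: item for item, word in code.items()}
    items, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            items.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return items


CODE = {"a": "0", "b": "10", "c": "11"}   # a small prefix-free code (illustrative)
message = ["b", "a", "c", "a"]
print(is_prefix_free(CODE), decode_list(CODE, encode_list(CODE, message)) == message)
```

With a code that is not prefix free, such as `{"a": "1", "b": "10"}`, the greedy decoder would emit `a` too early, which is why the hypothesis matters.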

Exercise 2:
• Give a prefix-free mapping $E : \{0,1\}^* \to \{0,1\}^*$.
  • Hint (example): $0010100 \mapsto 0010100\#$
• Give an upper bound on the ratio $|E(x)|/|x|$ for your $E$.
• (If you have extra time, try to think of $E$’s that improve the ratio.)

Solution

WARNING: I made some incorrect claims in lecture. This is the corrected, sanitized version.

The idea for our basic encoding $E$ is simple:

$\{0,1\}^* \xrightarrow{E_1} \{0,1,2\}^* \xrightarrow{E_2} \{0,1\}^*$

where $E_1 : \{0,1\}^* \to \{0,1,2\}^*$ is given by
  $E_1(x_0 \ldots x_{n-1}) = x_0 \ldots x_{n-1} 2$
and $E_2 : \{0,1,2\}^* \to \{0,1\}^*$ is given by
  $E_2(y_0 \ldots y_{m-1}) = e_2(y_0)\, e_2(y_1) \cdots e_2(y_{m-1})$
where $e_2 : \{0,1,2\} \to \{0,1\}^2$ is given by
  $e_2(0) = 00$, $e_2(1) = 11$, $e_2(2) = 10$.

To prove correctness we make the following claims:
① $E_1$ is prefix-free. (Verify yourselves.)
② $e_2$ is prefix-free. (Verify yourselves.)
③ If $E = E_2 \circ E_1$, where $E_2(y_0 \ldots y_{m-1}) = e_2(y_0) \cdots e_2(y_{m-1})$ and $E_1$ and $e_2$ are prefix-free, then so is $E$.

Notes:
① The transformation $e_2 \to E_2$ is exactly the one from Theorem 2.18 in the book (mentioned earlier in lecture). Thm 2.18 says $E_2$ is one-to-one.
② I claimed “$E_1$ prefix-free + $E_2$ one-to-one implies $E$ is prefix-free.” This is NOT CORRECT. Example due to Adam H.:
  ∘ Let $F : \{0,1,2\}^* \to \{0,1\}^*$ be the map from solution 2 to Exercise 1.
  ∘ Using $E_2 = F$ and $E_1$ as in this solution (so $E = F \circ E_1$), we get $E(\text{empty string}) = 101$ and $E(0) = 1011$.

Note that ① + ② + ③ $\Rightarrow$ $E$ is prefix-free.

Proof of ③:
• Suppose, for contradiction, that $x \ne z$ with $E(x) = y_0 \ldots y_{n-1}$ and $E(z) = y_0 \ldots y_{n-1} y_n \ldots y_{m-1}$, so that $E(x)$ is a prefix of $E(z)$.
• Let $d_2$ be the decoder corresponding to $e_2$, and $D_2$ the left-to-right decoder for $E_2$ based on $d_2$. [So $D_2(y_0 \ldots y_{n-1}) = d_2(y_0 \ldots y_i)\, d_2(y_{i+1} \ldots y_j) \cdots$.]
• Then $D_2(y_0 \ldots y_{n-1}) = w_0 \ldots w_{a-1}$ and $D_2(y_0 \ldots y_{m-1}) = v_0 \ldots v_{b-1}$ satisfy $w_0 = v_0$, $w_1 = v_1$, ..., $w_{a-1} = v_{a-1}$.
• In other words, $w_0 \ldots w_{a-1}$ is a prefix of $v_0 \ldots v_{b-1}$.
• But $w_0 \ldots w_{a-1} = E_1(x)$ and $v_0 \ldots v_{b-1} = E_1(z)$, and this implies $E_1(x)$ is a prefix of $E_1(z)$, contradicting “$E_1$ is prefix-free”.

Performance of our solution: $|E(x)| / |x| = $ ?

If $|x| = n$, then $|E_1(x)| = n + 1$ and $|E_2(E_1(x))| = 2(n+1)$

$\Rightarrow \lim_{n \to \infty} \frac{2(n+1)}{n} = 2$.
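The basic encoding $E = E_2 \circ E_1$ can be sketched in Python as follows. The names `encode_pf` and `decode_pf` are invented here; the decoder returns the unread remainder of the input as well, which is a convenience chosen for this sketch to show that the end of a codeword can be detected without looking ahead.

```python
E2_SYMBOL = {"0": "00", "1": "11", "2": "10"}          # the map e_2
D2_SYMBOL = {pair: sym for sym, pair in E2_SYMBOL.items()}


def encode_pf(x: str) -> str:
    """E(x) = E_2(E_1(x)): append the marker 2, then spend two bits per symbol."""
    marked = x + "2"                                   # E_1
    return "".join(E2_SYMBOL[c] for c in marked)       # E_2


def decode_pf(bits: str):
    """Read one codeword from the front of bits; return (x, remaining bits)."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        sym = D2_SYMBOL.get(bits[i:i + 2])
        if sym is None:
            raise ValueError("01 is not a valid pair")
        if sym == "2":                                 # end marker reached
            return "".join(out), bits[i + 2:]
        out.append(sym)
    raise ValueError("no end marker found")


x = "0010100"
y = encode_pf(x)
print(y, len(y) == 2 * (len(x) + 1))
print(decode_pf(y + "111"))
```

Since every codeword has length $2(n+1)$, the ratio $|E(x)|/|x|$ tends to 2, matching the analysis above.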

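Improved solution:

Claim: If $E : \{0,1\}^* \to \{0,1\}^*$ is a prefix-free encoding (e.g., the basic solution above) and $E_0 : \mathbb{N} \to \{0,1\}^*$ is the one-to-one binary representation from earlier, then $\bar{E}(x) = E(E_0(|x|))\, x$
(i) is prefix-free, and
(ii) satisfies $\lim_{|x| \to \infty} |\bar{E}(x)| / |x| = 1$.

Proof. (ii) is easy, so let's do that first, taking $E$ to be the basic solution with $|E(y)| = 2(|y| + 1)$.
If $|x| = n$ then $|E_0(|x|)| \le \log_2 n + 1$
$\Rightarrow |E(E_0(|x|))| \le 2(\log_2 n + 2)$
$\Rightarrow |\bar{E}(x)| \le n + 2 \log_2 n + 4$
$\Rightarrow \lim_{n \to \infty} \frac{n + 2 \log_2 n + 4}{n} = 1$.

(i) Assume again for contradiction that $\bar{E}(x)$ is a prefix of $\bar{E}(z)$ for some $x \ne z$.
Then since $E(E_0(|x|))$ is a prefix of $\bar{E}(x)$ and $E(E_0(|z|))$ is a prefix of $\bar{E}(z)$, we have
  ① $E(E_0(|x|))$ is a proper prefix of $E(E_0(|z|))$,
  or ③ $E(E_0(|z|))$ is a proper prefix of $E(E_0(|x|))$,
  or ② $E(E_0(|x|)) = E(E_0(|z|))$.
But ① and ③ can't happen since $E$ is prefix-free.
And if ② happens then $|x| = |z|$ (since $E$ and $E_0$ are one-to-one), and $x$ is a prefix of $z$ $\Rightarrow x = z$, a contradiction.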
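The improved, length-prefixed encoding $\bar{E}(x) = E(E_0(|x|))\, x$ can be sketched as below. It is self-contained, so the basic prefix-free encoding is restated in compact form; all function names are made up, and encoding the length 0 as the binary string "0" is an assumption of this sketch.

```python
def encode_pf(x: str) -> str:
    """Basic prefix-free encoding: double every bit, then append the marker 10."""
    return "".join(b + b for b in x) + "10"


def encode_len_prefixed(x: str) -> str:
    """E-bar(x) = E(E_0(|x|)) x: a prefix-free header with the length, then x itself."""
    return encode_pf(format(len(x), "b")) + x


def decode_len_prefixed(bits: str):
    """Read one codeword from the front of bits; return (x, remaining bits)."""
    pos, length_bits = 0, []
    while True:
        pair = bits[pos:pos + 2]
        pos += 2
        if pair == "10":                 # end of the header
            break
        if pair not in ("00", "11"):
            raise ValueError("malformed header")
        length_bits.append(pair[0])
    n = int("".join(length_bits), 2)     # raises ValueError on an empty header
    if len(bits) < pos + n:
        raise ValueError("payload shorter than the declared length")
    return bits[pos:pos + n], bits[pos + n:]


x = "01" * 500
print(len(encode_len_prefixed(x)) / len(x))
print(decode_len_prefixed(encode_len_prefixed("0010100") + "11"))
```

For a 1000-bit input the header costs 22 bits, so the ratio is already 1.022.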

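E.g., Representing Graphs
• $G = (V, E)$; $V = [n]$; $E \subseteq V \times V$
• Let $E_{\mathbb{N}} : \mathbb{N} \to \{0,1\}^*$ be prefix free.
• Let $E = \{(i_1, j_1), \ldots, (i_m, j_m)\} \subseteq V \times V$ and $V = [n]$.
• Then $E_G : \text{Graphs} \to \{0,1\}^*$ is given by
  $E_G(G) = E_{\mathbb{N}}(n)\, E_{\mathbb{N}}(i_1)\, E_{\mathbb{N}}(j_1) \cdots E_{\mathbb{N}}(i_m)\, E_{\mathbb{N}}(j_m)$.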
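A possible Python rendering of the graph encoding, assuming the bit-doubling scheme as the prefix-free encoding of natural numbers (one choice among many) and vertices numbered $0, \ldots, n-1$; the function names are hypothetical.

```python
def enc_nat(n: int) -> str:
    """Prefix-free encoding of a natural number: doubled binary digits, then 10."""
    return "".join(b + b for b in format(n, "b")) + "10"


def dec_nat(bits: str, pos: int):
    """Decode one number starting at index pos; return (value, next index)."""
    digits = []
    while bits[pos:pos + 2] != "10":
        pair = bits[pos:pos + 2]
        if pair not in ("00", "11"):
            raise ValueError("malformed or truncated number")
        digits.append(pair[0])
        pos += 2
    return int("".join(digits), 2), pos + 2


def encode_graph(n: int, edges: list) -> str:
    """E_G(G): the vertex count, then both endpoints of every edge in order."""
    return enc_nat(n) + "".join(enc_nat(i) + enc_nat(j) for i, j in edges)


def decode_graph(bits: str):
    """Recover (n, edges) by reading numbers until the string is exhausted."""
    n, pos = dec_nat(bits, 0)
    edges = []
    while pos < len(bits):
        i, pos = dec_nat(bits, pos)
        j, pos = dec_nat(bits, pos)
        edges.append((i, j))
    return n, edges


edges = [(0, 1), (1, 2), (3, 0)]
print(decode_graph(encode_graph(4, edges)) == (4, edges))
```

The decoder needs no edge count because the end of the string marks the end of the edge list.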

Example: Representing Matrix
• $M \in \mathbb{Z}^{m \times n}$:
  • First represent “list”
  • Then represent “list of lists”

Let $E : \mathbb{Z} \to \{0,1\}^*$ be prefix-free.

$E_{\text{list}}((i_1, \ldots, i_n)) = E(n)\, E(i_1) \cdots E(i_n)$ is prefix-free!

$E_{\text{mat}}(((a_{11}, \ldots, a_{1n}), \ldots, (a_{m1}, \ldots, a_{mn}))) = E(m)\, E_{\text{list}}((a_{11}, \ldots, a_{1n}))\, E_{\text{list}}((a_{21}, \ldots, a_{2n})) \cdots E_{\text{list}}((a_{m1}, \ldots, a_{mn}))$ is also prefix-free.
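The “list, then list of lists” construction can be sketched in Python. Two assumptions are made that the slide does not specify: integers are first mapped to natural numbers by a zigzag map ($0, -1, 1, -2, \ldots \mapsto 0, 1, 2, 3, \ldots$), and natural numbers use the bit-doubling prefix-free code. All names are illustrative.

```python
def enc_nat(n: int) -> str:
    """Prefix-free encoding of a natural number: doubled binary digits, then 10."""
    return "".join(b + b for b in format(n, "b")) + "10"


def dec_nat(bits: str, pos: int):
    """Decode one natural number starting at index pos; return (value, next index)."""
    digits = []
    while bits[pos:pos + 2] != "10":
        pair = bits[pos:pos + 2]
        if pair not in ("00", "11"):
            raise ValueError("malformed or truncated number")
        digits.append(pair[0])
        pos += 2
    return int("".join(digits), 2), pos + 2


def enc_int(z: int) -> str:
    """Prefix-free encoding of an integer via the zigzag map (an assumption)."""
    return enc_nat(2 * z if z >= 0 else -2 * z - 1)


def dec_int(bits: str, pos: int):
    k, pos = dec_nat(bits, pos)
    return (k // 2 if k % 2 == 0 else -(k + 1) // 2), pos


def enc_list(row: list) -> str:
    """E_list: the length, then every entry."""
    return enc_nat(len(row)) + "".join(enc_int(a) for a in row)


def dec_list(bits: str, pos: int):
    n, pos = dec_nat(bits, pos)
    row = []
    for _ in range(n):
        a, pos = dec_int(bits, pos)
        row.append(a)
    return row, pos


def enc_matrix(rows: list) -> str:
    """E_mat: the number of rows, then every row encoded as a list."""
    return enc_nat(len(rows)) + "".join(enc_list(r) for r in rows)


def dec_matrix(bits: str, pos: int = 0):
    m, pos = dec_nat(bits, pos)
    rows = []
    for _ in range(m):
        row, pos = dec_list(bits, pos)
        rows.append(row)
    return rows, pos


M = [[1, -2, 3], [0, 5, -7]]
code = enc_matrix(M)
print(dec_matrix(code) == (M, len(code)))
```

Because each decoder reports where it stopped, the same functions nest again for lists of matrices.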

Prefix-free encoding in practice
• “C style strings”: null terminated
• “Pascal style”: encode $x \in \{0,1\}^{\le 255}$ as $(|x|, x)$

...both led to many security breaks

TLS Heartbeat protocol
Check connection is alive:
• Request: $(|x|, x)$
• Response: $(|x|, x)$

Heartbleed attack

“Some might argue that it is the worst vulnerability found (at least in terms of its potential impact) since commercial traffic began to flow on the Internet.”,

Joseph Steinberg , Forbes.
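The flaw behind Heartbleed is that the responder trusted the length field of a “Pascal style” $(|x|, x)$ message instead of checking it against the payload actually received. The following Python toy model illustrates only that idea; it is not the OpenSSL code or the real TLS record layout, and the two-byte length field, the `memory_after` parameter, and the function names are all assumptions made for the illustration.

```python
def parse_request(request: bytes):
    """Split a toy heartbeat request into (claimed length, payload)."""
    claimed = int.from_bytes(request[:2], "big")
    return claimed, request[2:]


def reply_unchecked(request: bytes, memory_after: bytes) -> bytes:
    """Buggy responder: echoes `claimed` bytes without checking the real length."""
    claimed, payload = parse_request(request)
    buffer = payload + memory_after      # models unrelated data stored next to the payload
    return buffer[:claimed]


def reply_checked(request: bytes) -> bytes:
    """Responder that rejects a length field that disagrees with the payload."""
    claimed, payload = parse_request(request)
    if claimed != len(payload):
        raise ValueError("length field does not match payload")
    return payload


honest = (5).to_bytes(2, "big") + b"hello"
malicious = (11).to_bytes(2, "big") + b"hello"
print(reply_unchecked(honest, b"SECRET"))
print(reply_unchecked(malicious, b"SECRET"))
```

The malicious request gets back bytes that were never part of its payload, while `reply_checked` refuses it.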


Can we represent everything?
Unfortunate Fact (Thm 2.5): “Can’t represent real numbers as strings.”
There is no one-to-one function $RtS : \mathbb{R} \to \{0,1\}^*$.

Implications
Thesis: Everything is representable as $\{0,1\}^*$!

Everything = $\mathbb{Z}^*$

Or Everything = $\{0,1\}^*$

Or Everything = $\text{Keyboard}^*$

Includes music? All sounds? All images?

All smells? People (“Beam me up, Scotty!”)

Rest of the course:

Part I: Circuits: Finite computation, quantitative study

Part II: Automata: Infinite restricted computation, quantitative study

Part III: Turing Machines: Infinite computation, qualitative study

Part IV: Efficient Computation: Infinite computation, quantitative study

Part V: Randomized computation: Extending studies to non-classical algorithms