1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff...

17
1 Cryptanalysis • Four kinds of attacks (recall) • The objective: determine the key (Herckhoff principle) • Assumption: English plaintext text • Basic techniques: frequency analysis based on: – Probabilities of occurrences of 26 letters – Common digrams and trigrams.

Transcript of 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff...

Page 1: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

1

Cryptanalysis • Four kinds of attacks (recall)

• The objective: determine the key (Herckhoff principle)

• Assumption: English plaintext text

• Basic techniques: frequency analysis based on:– Probabilities of occurrences of 26 letters– Common digrams and trigrams.

Page 2: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

2

Cryptanalysis -- statistical analysis• Probabilities of occurrences of 26 letters

– E, having probability about 0.120 (12%)– T,A,O,I,N,S,H,R, each between 0.06 and 0.09– D,L, each around 0.04– C,U,M,W,F,G,Y,P,B, each between 0.015 and 0.028– V,K,J,X,Q,Z, each less than 0.01– See table 1.1, page 26

• 30 common digrams (in decreasing order):– TH, HE, IN, ER, AN, RE,…

• 12 common trigrams (in decreasing order):– THE, ING,AND,HER,ERE,…

Page 3: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

3

Cryptanalysis of Affine Cipher• Suppose a attacker got the following Affine cipher

– FMXVEDKAPHFERBNDKRXRSREFNORUDSDKDVSHVUFEDKAPRKDLYEVLRHHRH

• Cryptanalysis steps:– Compute the frequency of occurrences of letters

• R: 8, D:7, E,H,K:5, F,S,V: 4 (see table 1.2, page 27)– Guess the letters, solve the equations, decrypt the cipher, judge correct or not.

• First guess: Re, Dt, i.e., eK(4)=17, eK(19)=3– Thus, 4a+b=17 a=6, b=19, since gcd (6,26)=2, so incorrect.

19a+b=3• Next guess: Re, Et, the result will be a=13, not correct.• Guess again: Re, Ht, the result will be a=8, not correct again.• Guess again: Re, Kt, the result will be a=3, b=5.

– K=(3,5), eK(x)=3x+5 mod 26, and dK(y)=9y-19 mod 26.– Decrypt the cipher: algorithmsarequitegeneraldefinitionsofarithmeticprocesses

• If the decrypted text is not meaningful, try another guess.• Need programming: compute frequency and solve equations• Since Affine cipher has 12*26=312 keys, can write a program to try all keys.

Page 4: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

4

Cryptanalysis of substitution cipher• Final goal is to find the corresponding plaintext

letter for each ciphertext letter.• Ciphertext: example 1.11, page 28• Steps:

– Frequency computation, see table 1.3, page 29• Guess Ze, quite sure• C,D,F,J,M,R,Y are t,a,o,i,n,s,h,r, but not exact

– Look at digrams, especially –Z or Z-. • Since ZW occurs 4 times, but no WZ, so guess Wd

(because ed is a common digram, but not de)• Continue to guess

– Look at the trigrams, especially THE, ING, AND,…

Page 5: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

5

Cryptanalysis of Vigenere cipher

• In some sense, the cryptanalysis of Vigenere cipher is a systematic method and can be totally programmed.

• Step 1: determine the length m of the keyword– Kasiski test and index of coincidence

• Step 2: determine K=(k1,k2,…,km)

– Determine each ki separately.

Page 6: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

6

Kasiski test—determine keyword length m

• Observation: two identical plaintext segments will be encrypted to the same ciphertext whenever they appear positions apart in plaintext, where 0 mod m. Vice Versa.

• So search ciphertext for pairs of identical segments, record the distance between their starting positions, such as 1, 2,…, then m should divide all of i’s. i.e., m divides gcd of all i’s.

Page 7: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

7

Index of coincidence• Can be used to determine m as well as to confirm m,

determined by Kasiski test• Definition: suppose x=x1x2,…,xn is a string of length

n. The index of coincidence of x, denoted by Ic(x), is defined to be the probability that two random elements of x are identical.– Denoted the frequencies of A,B,…,Z in x by f0,f1,…,f25

--Ic(x)=i=0

25

( )fi

2

( )n2

= i=025

fi(fi-1)

n(n-1)( Formula IC )

Page 8: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

8

Suppose x is a string of English text, denote the expectedprobability of occurrences of A,B,…,Z by p0,p1,…,p25 withvalues from table 1.1, then Ic(x) pi

2 =0.0822+0.0152+…+0.0012=0.065(since the probability that two random elements both are A is p0

2, both are B is p12,…)

Index of coincidence (cont.)

Question: if y is a ciphertext obtained by shift cipher, what is the Ic(y)?

Answer: should be 0.065, because the individual probabilities will be permuted, but the pi

2 will be unchanged.

Therefore, suppose y=y1y2…yn is the ciphertext from Vigenere cipher.For any given m, divide y into m substrings:

y1=y1ym+1y2m+1… if m is indeed the keyword length, then y2=y2ym+2y2m+2… each yi is a shift cipher, Ic(yi) is about 0.065. …ym=ymy2my3m… otherwise, Ic(yi) 26(1/26)2 = 0.038.

Page 9: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

9

Index of coincidence (cont.)For purpose of verify keyword length m, divide the ciphertext into m substrings, compute the index of coincidence by formula IC for each substring. If all IC valuesof the substrings are around 0.065, then m is the correct keyword length. Otherwisem is not the correct keyword length.

If want to use Ic to determine correct keyword length m, what to do?

Beginning from m=2,3, … until an m, for which all substrings haveIC value around 0.065.

Now, how to determine keyword K=(k1,k2,…,km)? Assume m is given.

Page 10: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

10

Determine keyword K=(k1,k2,…,km)1. Determine each ki (from yi) independently.2. Observation:

2.1 let f0,f1,…,f25 denote the frequencies of A,B,…,Z in yi and n′=n/m2.2 then probability distribution of 26 letters in yi is:

f0 f25

n′, , n′2.3 if the shift key is ki, then f0+ki

(i.e., A+ki) is the frequency of

a in the corresponding plaintext xi , …, f25+ki (note the subscript

25+ki should be computed by modulo 26) is the frequency of z in xi. Since xi is normal English text, probability distribution of

f0+ki f25+ki

n′, , n′

should be “close to” ideal probabilitydistribution p0,p1,…,p25.

p0, …, p25

So: f0+ki

n'p0 +…+

f25+ki

n'p25

p02+…+p25

2 =0.065

Page 11: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

11

Determine keyword K=(k1,k2,…,km) (cont.)

Therefore, define:fi+g

Mg=i=025 pi

n′When g=ki, Mg will generally be around 0.065 (i.e., i=0

25 pi2).

Otherwise Mg will be quite smaller than 0.065.

So let g from 0, until 25, compute Mg, and for some g, if Mg is around 0.065, then ki=g.

Note: the subscript i+g should be seen as modulo 26.

f0+g

n'p0 +…+

f25+g

n'p25

On the other hand, for any g !=ki,

will not be close to 0.065.

Page 12: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

12

Cryptanalysis of Vigenere cipher--example• Example 1.12, page 33.

– Using Kasiski test to determine the keyword length• CHR appears five times at 1,166,236,276,286• the distance is 165, 235,275,285, the gcd is 5, so m=5.

– Using index of coincidence to verify m=5.• Divide ciphertext into y1, y2, y3, y4, y5

• Compute f0,f1,…,f25 for each yi and then Ic(yi), get 0.063, 0.068,0.069,0.061,0.072, so m=5 is correct.

– Determine ki for i=1,…,5.• Compute Mg for g=0,1,…,25 and if Mg 0.065, then let

ki=g. where Mg=i=025 pi

n′

fi+g

As a result, k1=9,k2=0,k3=13,k4=4,k5=19, i.e., JANET

Page 13: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

13

Cryptanalysis of Hill cipher• Difficult to break based on ciphertext only

• Easily to break based on both ciphertext and plaintext.

• Suppose given at least m distinct plaintext-ciphertext pairs: xj=(x1,j,x2,j,…,xm,j)

yj=(y1,j,y2,j,…,ym,j)

then define two matrices X=(xi,j) and Y=(yi,j)

Let Y=XK, if X is invertible, then K=X-1Y.

Page 14: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

14

• Suppose plaintext is: friday

and ciphertext is: PQCFKU

and the m=2. Then eK(f,r)=(P,Q), eK(i,d)=(C,F).

That is:

Cryptanalysis of Hill cipher--example

( ) = ( )K15 16 2 5

5 178 3

K=( )-1( )=( )( )=( )5 178 3

15 16 2 5

9 12 15

15 16 2 5

7 198 3

Then using the third pair, i.e., (a,y) and (K,U) to verify K.

In case m is unknown, try m=2,3, …

Page 15: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

15

Cryptanalysis of LFSR stream cipher• Vulnerable to known-plaintext attack.

• Suppose m, plaintext binary string x1,x2,…,xn and ciphertext binary string y1,y2,…,yn are known, as long as n>2m, the key can be broken:– Keystream is: zi=(xi+yi) mod 2. (i=1,2,…,n)

– Then the initialization vector of K is z1,…, zm.

– Next is to determine coefficients (c0,c1,…,cm-1) of K

(recall that zi+m = m-1j=0cjzi+j mod 2 for all i1)

i.e,

–(zm+1,zm+2,…,z2m)=(c0,c1,…,cm-1) z1 z2 … zm

z2 z3 … zm+1

……………zm zm+1 … z2m-1

Page 16: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

16

Cryptanalysis of LFSR stream cipher (cont.)

–(c0,c1,…,cm-1)=(zm+1,zm+2,…,z2m) z1 z2 … zm

z2 z3 … zm+1

……………zm zm+1 … z2m-1

Therefore:

-1

Page 17: 1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques:

17

Example 1.14, page 37. Suppose LFSR 5 with the following:

Cryptanalysis of LFSR stream cipher --example

Ciphertext string: 101101011110010Plaintext string: 011001111111000

Then keystream: 110100100001010

Therefore initialization vector is: 11010.For next five key elements: 01000, set up equation for coefficients (c0,c1,c2,c3,c4) and solve it.

The result is: (c0,c1,c2,c3,c4) =(1,0,0,1,0)i.e., zi+5=(zi+zi+3) mod 2.