Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.
-
Upload
coleen-gardner -
Category
Documents
-
view
226 -
download
0
Transcript of Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.
![Page 1: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/1.jpg)
Section 2.7: The Friedman and Kasiski Tests
Practice HW (not to hand in)
From Barr Text
p. 1-4, 8
![Page 2: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/2.jpg)
• Using the probability techniques discussed in the last section, in this section we will develop a probability based test that will be used to provide an estimate of the keyword length used to encipher a message with the Vigenere cipher. We also develop another test designed to estimate the keyword length that is based on the coincidental alignment of letter groups in the plaintext with the keyword. We first develop some facts concerning probability of letters occurring in standard English.
![Page 3: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/3.jpg)
Probability of Selecting Multiple Letters in Standard English
• In the standard English frequency table for letters, the probability of selecting a single letter list is the relative frequency converted to decimal, that is
100
tablein the foundletter theoffrequency relative
English standardin
letter single a selecing
P
![Page 4: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/4.jpg)
Example 1: Using the standard English
frequency table, what is the probability of
selecting an E? A X?
Solution:
![Page 5: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/5.jpg)
Example 2: In a large sample of English text,
estimate the probability of selecting two E’s. Two
A’s.
Solution:
![Page 6: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/6.jpg)
For convenience, we will assign the variables
to represent the probabilities of selecting the
letters A, B, C, D, E, …, Y, Z from the standard
English alphabet. The subscripts of the variables
correspond to the MOD 26 alphabet assignment
number of the corresponding alphabet letter. We
will use this variable assignment in the next
example.
2524432 ,10 ,, , , , ppppppp
![Page 7: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/7.jpg)
Example 3: What is the probability that two
randomly selected English letters are the same?
Solution: Using the standard English frequency
table, we see that
![Page 8: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/8.jpg)
0.065
)00074.0(
)01974.0( )00150.0( )02360.0( )00978.0( )02758.0(
)09056.0( )06327.0( )05987.0( )00095.0( )01929.0(
)07507.0( )06749.0( )02406.0( )04025.0( )00772.0(
)00153.0( )06966.0( )06094.0( )02015.0( )02228.0(
)12702.0()04253.0()02782.0()01492.0()08167.0(
s) Z'2(s)Y' 2(s)C' 2(s)B' 2(s)A' 2(same theare
letters two
2
22222
22222
22222
22222
22222
225
224
22
21
20
ppppp
PPPPPP
█
![Page 9: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/9.jpg)
Friedman Test
• The Friedman Test is a probabilistic test that can be used to determine the likelihood that the ciphertext message produced comes from a monoalphabetic or polyalphabetic cipher. This technique of cryptanalysis was developed in 1925 by William Friedman.
![Page 10: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/10.jpg)
• If the cipher is a polyalphabetic Vigenere encipherment, Friedman’s test is also useful in approximating the length of the keyword used. To show how this works, we start with the following definition:
![Page 11: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/11.jpg)
Definition: Index of Coincidence.
Denoted by I, the index of coincidence represents
the probability that two randomly selected letters
are identical.
![Page 12: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/12.jpg)
Index of Coincidence for Monoalphabetic Ciphers
In monoalphabetic ciphers, the frequencies of
letters in standard English are preserved when
converting from plaintext to ciphertext. The
following example illustrates why this is true.
![Page 13: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/13.jpg)
Example 4: Illustrate why the Caesar shift cipher
preserves frequencies when converting from plain
the ciphertext.
Solution:
![Page 14: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/14.jpg)
Recall the index of coincidence represents the
probability that two randomly selected letters are
identical. Since monoalphabetic ciphers
preserves frequencies, the index of coincidence
of the plaintext alphabet of standard English will
be exactly the same as the index of coincidence
of the ciphertext alphabet for a monoalphabetic
cipher. Using the result from Example 3, this fact
results in the following statement:
![Page 15: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/15.jpg)
Index of Coincidence for Monoalphabetic Ciphers
065.0same theare letters
selectedrandomly Two
Ciphers eticMonoalphabFor
eCoincidenc ofIndex
PI
![Page 16: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/16.jpg)
Index of Coincidence for Polyalphabetic Ciphers
• In a polyalphabetic cipher, the goal is to distribute the letter frequencies so that each letters has the same likelihood of occurring in the ciphertext. The next example determines what the index of coincidence is for a polyalphabetic cipher for a large collection of letters.
![Page 17: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/17.jpg)
Example 5: Determine the probability that two
randomly selected letters are identical of the
ciphertext of a message enciphered with a
polyalphabetic cipher, assuming there are a very
large number of letters in the ciphertext.
Solution:
![Page 18: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/18.jpg)
![Page 19: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/19.jpg)
![Page 20: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/20.jpg)
Since the index of coincidence represents the
probability that two randomly selected letters are
identical, Example 5 allows us to make the
following statement:
Index of Coincidence for Polyalphabetic Ciphers
0385.0same theare letters
selectedrandomly Two
Ciphers eticPolyalphabFor
eCoincidenc ofIndex
PI
![Page 21: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/21.jpg)
The index of coincidence values for
monoalphabetic (0.065) and polyalphabetic
ciphers (0.0385) were derived assuming that the
plaintext message has a very large number of
letters. When messages are enciphered and
deciphered, these messages are normally
much shorter. Hence, the index of coincidence for
a typical enciphered message enciphered will be
bounded somewhere between 0.0385 and 0.065.
This leads to the following statement:
![Page 22: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/22.jpg)
Index of Coincidence Bound
For a typical ciphertext message, the index of
coincidence I satisfies:
Fact: If I is close to 0.0385, then the cipher is
likely to have been obtained from a
polyalphabetic cipher. If I is closer to 0.065, the
cipher is likely to be monoalphabetic.
065.00385.0 I
![Page 23: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/23.jpg)
Knowing what the value for the index of
coincidence tells us, we now need to derive a
formula for calculating it. Before doing this, we
need to recall the following fact concerning
summation notation:
![Page 24: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/24.jpg)
FactSummation notation is a shorthand notation in
mathematics for indicating the sum of many
terms. We say that
represents the sum of k terms , ,
where the index i starts at the first term (i = 1) and
we sum until we reach the upper index k of the
summation symbol.
k
iki aaaa
121
kaaa , , , 21
![Page 25: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/25.jpg)
Example 6: Compute
.
Solution:
5
1
2
i
i
![Page 26: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/26.jpg)
Derivation of Formula for the Index of Coincidence for a Given Ciphertext Message
Suppose a ciphertext message is received. Let
be the counts of the
number of occurrences of the alphabet letters A,
B, C, …, Y, Z that occur in the ciphertext (note
that the subscript of each variable corresponds to
the MOD 26 alphabet assignment number of the
corresponding letter). Suppose
2524210 , , , , , nnnnn
![Page 27: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/27.jpg)
represents the total sum of all of the letters in the
ciphertext. Recall that the index of coincidence
represents the probability that two randomly
selected letters are identical. Using the
multiplication principle of probability for n total
letters, we can compute the following probabilities
for each individual letter:
2524210 letters ofnumber Total nnnnnn
![Page 28: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/28.jpg)
)1(
)1(
)1(
)1(
selected
randomly are sA' 2 0000
nn
nn
n
n
n
nP
)1(
)1(
)1(
)1(
selected
randomly are sB' 2 1111
nn
nn
n
n
n
nP
)1(
)1(
)1(
)1(
selected
randomly are sC' 2 2222
nn
nn
n
n
n
nP
)1(
)1(
)1(
)1(
selected
randomly are sY' 2 24242424
nn
nn
n
n
n
nP
)1(
)1(
)1(
)1(
selected
randomly are s Z'2 25252525
nn
nn
n
n
n
nP
![Page 29: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/29.jpg)
Since these probabilities are mutually exclusive,
we can sum the individual probabilities for each
letters to find the index on coincidence.
We summarize this result.
25
0
25252424221100
25252424221100
)1()1(
1
)1()1()1()1()1()1(
1
)1(
)1(
)1(
)1(
)1(
)1(
)1(
)1(
)1(
)1(
same theare selected
randomly letters 2
eCoincidenc
ofIndex
iii nnnn
nnnnnnnnnnnn
nn
nn
nn
nn
nn
nn
nn
nn
nn
nn
P
![Page 30: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/30.jpg)
Formula for the Index of Coincidence
The index of coincidence I is given by the formula:
where n represents the total number of letters in the ciphertext and represents the number of letters corresponding to each individual letter in the ciphertext with , , etc.
)1()1()1()1()1()1(
1
)1()1(
1
25252424221100
25
0
nnnnnnnnnnnn
nnnn
Ii
ii
25 , 1, ,0 , ini
sA' ofnumber the0 n sB' ofnumber the1 n sC' ofnumber the2 n
![Page 31: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/31.jpg)
Example 7: Suppose we receive the message"HLUBN WFSFK IGIHM GBSIM MBSEJ MAFUT QECII LJSUB BAXMA JCWXC MBSGZ GGSMK BHUQB ETVUS MLMER CFDTW UBASW ERFIE LOMVY SIMMY YEDDM MSGZA NCOFY YTIHL JRYOH KLOFH IEFKQ OFAAI ZGIEJ HAKNZ JSQRU QXDKW HSNNF AOUMO ROFAA IZPIQ YHQFY SWEFK ILDPQ GIXUE ADFWN NFVYO TRXRG QKRUS HVYHA GYONT TZISI EPUOF XAZRN ZTSQK BGIIS MIMII SMIMX HAHUF ZNMFG WIMMB QWQLT SMZTU XRBSA EMFGW IHUAM WQFFV CKNSM TIJYY RCOJR ASBOE YHQAI KYPAK YJKUX VUFIG GBCFY HQKIJ QDFVU LBOGZ XTQOI MIMWH QOXUQ EMBIX KYAIB SAEFC UKPYA ILKJL RRIQT URSYD QUOYS OJLXR IQTUB IHC"
![Page 32: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/32.jpg)
Use the index of coincidence to decide if this
ciphertext was produced by a monoalphabetic or
polyalphabetic cipher.
Solution: Using the Friedman Maplet, we can
generate the following frequency table for the
letters in this message:
![Page 33: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/33.jpg)
A B C D E F G H I J K L M
N O P Q R S T U V W X Y Z
This gives 4382524210 nnnnnn
total letters. Thus the index of coincidence is:
I =
25
0
)1()1(
1
iii nn
nn
= )]1()1()1()1()1([)1(
125252424221100
nnnnnnnnnn
nn
=
230 n 181 n 102 n 83 n 164 n 255 n 166 n 187 n 378 n 129 n 1710 n 1211 n 2912 n
1113 n 1814 n 515 n 2216 n 1617 n 2518 n 1419 n 2220 n 721 n 1222 n 1323 n 2124 n 1125 n
1011202112131112
6721221314242515162122451718101128291112
16171112363717181516242515167891017182223
437438
1
![Page 34: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/34.jpg)
=
=
Since 0.043186 is much closer to 0.0385 than 0.065, the cipher is likely polyalphabetic.
█
1104201561324246218260024046220
30611081213227213213323062406002405690306506
191406
1
043186.095703
4133
191406
8266)8266(
191406
1
![Page 35: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/35.jpg)
Using the Index of Coincidence to Estimate the Keyword Length for the Vigenere Cipher
• So far we have used the index of coincidence to determine whether a cipher is polyalphabetic or monoalphabetic. Once this is determined, it can be used to estimate the keyword length. Knowing the keyword length is the first essential step when attempting to break the Vigenere cipher. The following formula gives an estimate of the keyword length:
![Page 36: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/36.jpg)
Keyword Length Formula for the Vigenere Cipher
where
n = the number of letters in the ciphertext
message.
I = index of coincidence.
k = keyword length.
)0385.0(065.0
0265.0
InI
nk
![Page 37: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/37.jpg)
Example 8: For the ciphertext message given in
the previous example, use the index of
coincidence to estimate the keyword length.
Solution:
![Page 38: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/38.jpg)
The Kasiski Test
• The Kasiski test is another method that can be used to approximate the keyword length in the Vigenere cipher. The cipher was first published by a retired Prussian Army officer named Friedrich Wilhelm Kasiski in 1863. The Kasiski test had been independently discovered almost a decade earlier, in 1854, by the English inventor Charles Babbage .
![Page 39: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/39.jpg)
• The Kasiski test relies on the occasional coincidental alignment of letter groups in the plaintext with the keyword to give a keyword length estimate. The test says if a string of characters appears repeatedly in a polyalphabetic ciphertext message, it is possible (though not certain), that the distance between the occurrences is a multiple of the length of the keyword.
![Page 40: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/40.jpg)
• To demonstrate how this works, suppose the Vigenere cipher is used to encipher the message “THE CHILD IS FATHER OF THE MAN” using the keyword “POETRY” to produce the following ciphertext.
Plaintext T H E C H I L D I S F A T H E R O F T H E M A N
Keyword P O E T R Y P O E T R Y P O E T R Y P O E T R Y
Ciphertext I V E V Y G A R M L M Y I V E K F D I V E F R L
The keyword “POETRY” is six letters long. Note that the trigraph “IVE” occurs three times in the ciphertext. The second occurrence of “IVE” occurs 12 character positions after the first. The third occurrence of “IVE” occurs 6 character positions after the second.
![Page 41: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/41.jpg)
• This leads to the assertion that the separations of common letter occurrences stand a good chance of being multiple of the keyword. This observation leads to the following fact concerning the Kasiski test.
Fact: The greatest common divisor or divisor of it of the separations of common characters that occur in a ciphertext enciphered by the Vigenere cipher tends to be a good chance of being equal or at least some multiple of the keyword.
![Page 42: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/42.jpg)
• Since “IVE” was separated by 12 characters and then 6 characters, then by observing that gcd(6, 12) = 6, we see that we have hit exactly the number of letters that occurred in the keyword. We conclude with one more example illustrating the Kasiski test.
![Page 43: Section 2.7: The Friedman and Kasiski Tests Practice HW (not to hand in) From Barr Text p. 1-4, 8.](https://reader031.fdocuments.us/reader031/viewer/2022013115/56649ed35503460f94be34dd/html5/thumbnails/43.jpg)
Example 9: Using the Kasiski Maplet, estimate
the keyword length of the ciphertext given in
Example 7 applying the principles of the Kasiski
test.
Solution: Will demonstrate using the Kasiski
Maplet in class.
█