Compression Encryption


    Understanding the Raw Materials of the Internet

    by


    Table of Contents

    Why encryption and compression are important
        Compression saves money
        Encrypted data can't be compressed
        Encryption must follow compression

    How compression works
        Duplicate strings of characters replaced with tokens
        Compression speed is important
        What data makes the smallest files
        Electronic mail messages contain many compressible phrases
        The HTML used for Netscape's homepage compressed at the rate of 5 to 1
        Types of compression programs
        Lossless vs. Lossy compression

    How encryption works
        Mathematical functions create ciphertext
        Similarities between code breaking and compression
        Brute force computing

    Breaking codes
        Substitution codes
        Frequency patterns
        Breaking codes with Microsoft Office
        Caesar's code
        Hiding letter frequencies with the Vigenere cipher
        Transposition codes
        A known plaintext attack

    Contemporary Encryption
        Algorithms and key length
        Symmetric and public keys
        Authentication

    Conclusion
        The Internet's building blocks
        LZS compression, the de facto standard
        Encryption: Essential for the Internet's growth


    Why encryption and compression are important

    Compression saves money

    The risks of poor computer security are obvious and sometimes dramatic. In 1995, for example, Kevin Mitnick stole more than 20,000 credit card numbers from Netcom, Inc. Mitnick's computer invasions spurred an intense interest in security, which has been compared to the effect of the Sputnik satellite in 1957 on American education. Out of this concern, three Internet security protocols have emerged: Point-to-Point Protocol (PPP), Secure Sockets Layer (SSL) and Internet Protocol Security (IPSec). All of these protocols support data compression.

    The Internet is safer today than it was two years ago. But business users are equally at risk of being robbed when they send uncompressed data across the Internet. Compression saves on computer processing power, memory and transmission costs, all of which are ultimately passed on to the user. The reasons for performing compression are simple, and they are all preceded with dollar signs.

    In this primer, we discuss compression before encryption, and not just to observe alphabetical order. For reasons that will become clear shortly, compression must be performed before encryption, or it can't be performed at all.

    Compression, even in modest amounts, produces big savings. Most businesses connect to the Internet not through modem lines like home users but through faster and more expensive data links such as ISDN. As the business grows, and the amount of its data increases, the company must buy more equipment and more telephone service. Since it is not possible to buy part of an ISDN line, for example, the business's expenses will double even if the company's need for additional capacity, or bandwidth, is only slightly more than the capacity of a single line. The situation is like a letter that weighs 1.1 ounces; if you can't get the weight below an ounce, you have to buy two stamps. Very large organizations often find themselves in the same position when they outgrow their high-capacity T1 or T3 lines.


    Encrypted data can't be compressed

    Compressing and encrypting data obviously makes good business sense. However, these functions cannot be performed interchangeably. Compression must be performed before encryption. It is impossible to compress encrypted data.

    Compression depends upon finding patterns within messages that can be represented by shorter symbols, called tokens. In contrast, encryption removes patterns from messages. For this reason, compression must be performed first.

    Indeed, one of the tests used by professional code breakers is to try to compress a secret message. If the message is compressible, the encryption formula that produced it fails the test.

    Attempting to compress properly encrypted data not only fails to make the data smaller; it can actually cause the file to grow. Compression may add data to the file that is not compensated for by any reduction in file size. While this data expansion problem can be avoided through good design practices, an encrypted file never gets any smaller and may even get bigger.
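    The claim that pattern-free data will not shrink can be checked with a short experiment. The sketch below is only an illustration under stated assumptions: it uses Python's zlib module (DEFLATE, not LZS) as the compressor, and pseudo-random bytes as a stand-in for well-encrypted ciphertext.

```python
import random
import zlib

# Highly repetitive "plaintext": the same sentence over and over.
text = b"The IBM Corporation is a large corporation. " * 100

# Stand-in for ciphertext: pseudo-random bytes of the same length.
# (Assumption: well-encrypted data looks statistically like random bytes.)
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(text)))

packed_text = zlib.compress(text)
packed_noise = zlib.compress(noise)

print(len(text), "->", len(packed_text))    # the repetitive text shrinks
print(len(noise), "->", len(packed_noise))  # the random-looking data does not
```

    The random-looking input comes out slightly larger than it went in, because the compressor still has to add its own framing bytes.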

    Encryption must follow compression

    While you cannot compress data after encryption, it is quite possible, and desirable, to compress data before encryption. Compression not only saves on transmission costs; it also saves on the costs of encryption. If you compress a file to half of its former size, encryption will use half as much processing power.

    Type of service   Approx. monthly cost of telephone service   Max. data transfer rate (bits per second)
    Modem             $20                                         56K
    ISDN              $50                                         128K
    T1                $500-$1000                                  1,544K
    T3                $10,000 and up                              44,736K


    Encryption performs complex mathematical operations on each block of data, requiring lots of computational power. Smaller files save money because they encrypt faster. To be sure, compression also requires processing power, but dedicated compression chips can perform this function much less expensively than general-purpose microprocessors.

    The newest approach to compression and encryption combines both functions on the same chip. This guarantees that the functions are performed in the right order (compression first, encryption second), and reduces even more the demand on the main processor.

    How compression works

    Duplicate strings of characters replaced with tokens

    Compression works by replacing repeating strings of characters with shorter tokens. For example, the following message uses the same phrase in three different places.

    Uncompressed

    Page 1:   The IBM Corporation is a large corporation.
    Page 2:   The IBM Corporation is a profitable organization.
    Page 10:  The IBM Corporation is a force in world chess competition.

    Compressed

    Page 1:   The IBM Corporation is a large corporation.
              (The first appearance of the phrase is not compressed.)
    Page 2:   (40,25) profitable organization.
              (The text from page two is compressed using the token (40,25). The token means "Go back 40 characters and get the next 25 characters.")
    Page 10:  The IBM Corporation is a force in world chess competition.
              (The text from page 10 is not compressed because it is outside the 2-page window of compression.)
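    The token idea can be sketched in a few lines of code. This is a simplified, generic sliding-window scheme written for illustration, not Hifn's LZS format; the names WINDOW, MIN_MATCH, compress and decompress are assumptions made up for this example.

```python
WINDOW = 2048   # how far back the search may look (assumed from the ~2,000-character window)
MIN_MATCH = 3   # shorter repeats are passed along as plain characters (illustrative choice)

def compress(text):
    """Return a list of plain characters and (offset, length) back-reference tokens."""
    out, i = [], 0
    while i < len(text):
        best_len, best_off = 0, 0
        # Look for the longest earlier string that matches the text starting at i.
        for j in range(max(0, i - WINDOW), i):
            k = 0
            while i + k < len(text) and text[j + k] == text[i + k]:
                k += 1
            if k > best_len:
                best_len, best_off = k, i - j
        if best_len >= MIN_MATCH:
            out.append((best_off, best_len))   # "go back best_off, copy best_len"
            i += best_len
        else:
            out.append(text[i])
            i += 1
    return out

def decompress(tokens):
    chars = []
    for tok in tokens:
        if isinstance(tok, tuple):
            off, length = tok
            for _ in range(length):
                chars.append(chars[-off])      # copy one character at a time
        else:
            chars.append(tok)
    return "".join(chars)

page1 = "The IBM Corporation is a large corporation. "
page2 = "The IBM Corporation is a profitable organization."
tokens = compress(page1 + page2)
print(tokens)
print(decompress(tokens) == page1 + page2)   # True
```

    With these two sentences the repeated phrase becomes the token (44, 25); the offset differs from the (40,25) in the figure only because it depends on exactly how many characters separate the two occurrences.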


    Compression speed is important

    In Hifn's LZS compression, the repetitive characters must be within about 2,000 characters (2,048 bytes) of each other. Otherwise, they will be passed along as uncompressed text.

    Why 2,000 characters? Wouldn't a larger chunk of text produce more compression? The answer is yes, a larger window would produce somewhat more compression, but at a cost of slower performance. The point of compression is to speed up the Internet, not slow it down. And since compression must be done "on the fly" in very fast-moving computer networks, performance is crucial.

    Extensive Hifn testing has found that a sliding window of about two double-spaced pages of text is best in terms both of compression efficiency and time required to compress. Repeated searching for redundant strings of data can also deliver more compression, but at the expense of speed.

    Because of its speed, LZS compression, whose patents are owned by Hifn, is the de facto standard for compression among the major hardware and software companies who are building the Internet.

    The table below shows the difference in compression ratios and compression speed when Thomas Hardy's 141,000-word novel Far from the Madding Crowd is compressed using LZS and WinZip.

    LZS v WinZip

    Compression program   Original size (bytes)   Compressed size (bytes)   Ratio   Time to compress
    LZS                   785K                    458K                      1.7:1   1 second
    WinZip                785K                    322K                      2.4:1   5 seconds

    The table shows that WinZip makes the file smaller but takes five times as long. For personal use, the time taken probably doesn't matter. However, when merging onto the Information Super Highway, speed is all-important. The data stream must be able to travel at the maximum data rate of the telephone line. Most large companies and organizations use a T1 line, which can send data at the rate of 1.54 million bits, or about 190,000 bytes, per second. At WinZip's rate of 157,000 bytes per second (785K in five seconds), "data bubbles" would quickly build up in the data stream. These data bubbles waste T1 capacity, which translates to wasted money.
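    The arithmetic behind this comparison is easy to reproduce. The sketch below assumes that the table's "K" means thousands of bytes and takes the nominal T1 rate of 1,544,000 bits per second; the variable names are invented for the example.

```python
T1_BITS_PER_SECOND = 1_544_000                  # nominal T1 line rate
T1_BYTES_PER_SECOND = T1_BITS_PER_SECOND // 8   # 193,000 bytes per second

ORIGINAL_SIZE = 785_000  # bytes; assumes "785K" in the table means 785,000

# (program, compressed size in bytes, seconds to compress), taken from the table
RUNS = [("LZS", 458_000, 1), ("WinZip", 322_000, 5)]

results = {}
for name, compressed_size, seconds in RUNS:
    ratio = ORIGINAL_SIZE / compressed_size
    throughput = ORIGINAL_SIZE / seconds        # input bytes handled per second
    keeps_up = throughput >= T1_BYTES_PER_SECOND
    results[name] = (round(ratio, 1), throughput, keeps_up)
    print(f"{name}: {ratio:.1f}:1 at {throughput:,.0f} bytes/s, keeps up with T1: {keeps_up}")
```

    On these figures only the faster compressor consumes data at least as quickly as a T1 line can carry it.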

    What data makes the smallest files

    Highly repetitive documents can be compressed more than documents with fewer redundant character strings. Electronic mail messages and web pages created in Hypertext Markup Language (HTML) are highly compressible (see examples below). On the other hand, encrypted data, which resembles page after page of random numbers, cannot be compressed at all. Previously compressed data sometimes can be compressed further, although usually not very much.

    Electronic mail messages contain many compressible phrases

    From - Thu Aug 07 15:43:14 1997
    Received: from mailman.hifn.com (mailman.hifn.com [206.19.120.66])
        by interstice.com (8.8.6/8.6.9) with SMTP id NAA29787 for ;
        Thu, 7 Aug 1997 13:45:08 -0700 (PDT)
    Received: by mailman.hifn.com with SMTP
        (Microsoft Exchange Server Internet Mail Connector Version 4.0.994.63)
        id ; Thu, 7 Aug 1997 13:37:43 -0700
    Received: from smtp2.cerf.net by mailman.hifn.com with SMTP
        (Microsoft Exchange Internet Mail Connector Version 4.0.994.63)
        id SBC2RB16; Thu, 7 Aug 1997 13:37:37 -0700
    Received: from interstice.com (inter2.interstice.com [209.50.32.201])
        by smtp2.cerf.net (9.9.8/8.6.10) with ESMTP id NAA21231;
        Thu, 7 Aug 1997 13:35:24 -0700 (PDT)
    Received: from int-226-70.interstice.com (int-226-70.interstice.com [205.199.226.70])
        by interstice.com (8.8.6/8.6.9) with SMTP id NAA28636;
        Thu, 7 Aug 1997 13:01:51 -0700 (PDT)
    Received: by int-226-70.interstice.com with Microsoft Mail
        id ; Thu, 7 Aug 1997 13:12:08 -0700
    Message-ID:
    X-UIDL: 870986870.001


    Lossless vs. Lossy compression

    Lossless compression works the way you would expect. No data is lost during compression and decompression. You would not want your bank statement, for example, to lose a deposit or two during compression. Hifn is in the lossless compression business.

    Lossy compression is used for pictures and sounds. In this type of compression, data is lost during compression. This is possible because computers store many more colors and sounds than human eyes and ears can see or hear. Lossy compression can safely throw away much of the original data during compression. The decompressed files look and sound fine.

    How encryption works

    Mathematical functions create ciphertext

    Encryption works by applying mathematical functions to ordinary text so that it is apparently changed beyond recognition. But mathematical functions work in both directions. The original text can be restored, provided you know the key used during encryption. Here's a very simple example: Let the letter "A" be represented by the number 65 (as it is in fact represented by virtually all computer systems). Multiply 65 by the key of 5 to get 325. Since 325 is not generally known to represent "A," it is now secret. To convert the number 325 back to 65, divide by the key (5).
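    The multiply-and-divide example translates directly into code. This is a toy sketch of the arithmetic described above, not a usable cipher; the function names are invented for the illustration.

```python
KEY = 5

def encrypt(text, key=KEY):
    """Turn each character into its character code multiplied by the key."""
    return [ord(ch) * key for ch in text]

def decrypt(numbers, key=KEY):
    """Divide each number by the key to get the character code back."""
    return "".join(chr(n // key) for n in numbers)

print(ord("A"))        # 65
print(encrypt("A"))    # [325]
print(decrypt([325]))  # A
```

    The same function pair works on whole words, since every character is handled independently with the same key.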

    Term                        Definition
    1. Cipher                   A secret code
    2. Encrypt (or encipher)    Change plaintext to ciphertext
    3. Decrypt (or decipher)    Change ciphertext to plaintext
    4. Cryptology               Art of secret writing or making codes
    5. Cryptanalysis            Code breaking


    Breaking codes

    All of the examples in this section are taken from ciphers developed hundreds, if not thousands, of years ago. The most recent technique discussed was first published in 1918, well before the invention of programmable electronic computers. Modern encryption is far more complex. Whether simple or complicated, however, almost all encryption methods use substitution or transposition, or both.

    Substitution codes

    There is a simple code used in the movie 2001: A Space Odyssey. The computer in this movie, named HAL, was really IBM. To encipher a message in this code, just substitute each letter of the alphabet with the one preceding it in the alphabet (B becomes A; C becomes B . . . A becomes Z).

    Here's what the previous paragraph looks like in the HAL ciphertext:

    SGDQD HR Z RHLOKD BNCD TRDC HM SGD LNUHD Z ROZBD NCXRRDX
    SGD BNLOTSDQ HM SGHR LNUHD MZLDC GZK VZR QDZKKX HAL
    SN AQDZJ SGHR BNCD ITRS RTARSHSTSD DZBG KDSSDQ NE SGD ZKOGZADS
    VHSG SGD NMD ADGHMC HS HM ZKOGZADS Z ADBNLDR Y
    SGHR HR Z UZQHZMS NE NMD NE SGD NKCDRS JMNVM BNCDR TRDC AX ITKHTR BZDRZQ
    BZDRZQ R BNCD RGHESDC DZBG KDSSDQ SGQDD SN SGD KDES HMRSDZC NE ITRS NMBD

    This code is easy to use, and easy for your friends to decipher. Unfortunately, it is also extremely easy for your enemies to decipher. The HAL code uses the simple formula Ciphertext = Plaintext - 1 (equivalently, Plaintext = Ciphertext + 1).
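    A shift cipher of this kind takes only a few lines of code. In the sketch below, enciphering with the HAL code moves each letter back one place and deciphering moves it forward one place; the function name shift is an assumption made for the illustration.

```python
import string

ALPHABET = string.ascii_uppercase

def shift(text, key):
    """Move each letter `key` places through the alphabet, wrapping around at the ends."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            out.append(ch)   # leave spaces and punctuation alone
    return "".join(out)

print(shift("IBM", -1))                     # HAL
print(shift("HAL", +1))                     # IBM
print(shift("SGDQD HR Z RHLOKD BNCD", +1))  # THERE IS A SIMPLE CODE
```

    Because the key is a single small number, anyone who guesses it can decipher every message written in the code.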


    Frequency patterns

    Nevertheless, if you are new to codes, cracking this simple cipher is a good way to understand the importance of producing ciphertext that looks like random letters or numbers. The HAL cipher has a discernible non-random pattern that makes it vulnerable to attack.

    Here's how to analyze this passage. Count the number of times each letter appears in the ciphertext. These frequency patterns are an important clue. Here is a partial distribution of letters in the above passage:

    Five most common letters     Five least common letters
    Letter   Occurrences         Letter   Occurrences
    D        46                  I        3
    S        32                  U        3
    R        25                  V        3
    Z        23                  J        2
    H        20                  Y        1

    You probably can see at a glance that this is not the normal distribution of letters in English words. In the game of Scrabble, there are 12 "E" tiles, more than any other letter. Why is "E" not in the top five? And why does "Z" appear eight times more frequently than "I"?

    If the 390 letters were distributed randomly, each block of five would appear 75 times. Instead, the top five letters appear 146 times while the bottom five appear only 12 times. The human meaning behind this code betrays itself by its non-random distribution of letters.

    It is reasonable to guess that "D" stands for "E," since "E" is the most common letter in English.
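    Counting letters by hand is tedious, and a computer does it instantly. The sketch below tallies the letters in the first few words of the HAL ciphertext; the function name letter_counts is invented for the example, and the short sample is used only to keep the listing brief.

```python
from collections import Counter

def letter_counts(ciphertext):
    """Count how often each letter appears, ignoring spaces and punctuation."""
    return Counter(ch for ch in ciphertext.upper() if ch.isalpha())

sample = "SGDQD HR Z RHLOKD BNCD TRDC HM SGD LNUHD"
counts = letter_counts(sample)

# List the letters from most to least common.
for letter, n in counts.most_common():
    print(letter, n)
```

    Even in this short sample the letter D stands out, which is the same clue the full passage gives.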


    The second most common letter in English is "T." While this is not as obvious a guess, let's try it anyway. Here is what the passage looks like now:

    TGEQE HR Z RHLOKE BNCE TREC HM TGE LNUHE Z ROZBE NCXRREX
    TGE BNLOTTEQ HM TGHR LNUHE MZLEC GZK VZR QEZKKX HAL
    TN AQEZJ TGHR BNCE ITRT RTARTHTTTE EZBG KETTEQ NE TGE ZKOGZAET
    VHTG TGE NME AEGHMC HT HM ZKOGZAET Z AEBNLER Y
    TGHR HR Z UZQHZMT NE NME NE TGE NKCERT JMNVM BNCER TREC AX ITKHTR BZERZQ
    BZERZQ R BNCE RGHETEC EZBG KETTEQ TGQEE TN TGE KEET HMRTEZC NE ITRT NMBE

    Breaking codes with Microsoft Office

    A. Microsoft Word. Almost any word processor can be used to search and replace ciphertext to crack a substitution code. However, after you have partially solved a message, it can be difficult to avoid accidentally replacing the plaintext letters as you continue to search for ciphertext. For example, after you have replaced all the Ds with Es in the example exercise, you want to be able to substitute F for E, without also changing the real Es to Fs. Word 7 allows you to search and replace by format, so you can replace regular text with bold text as we have done in the examples in this booklet.

    1. Go to Edit-Replace.
    2. Click on the More button to expand the dialog box.
    3. Enter D in the Find What box.
    4. Select the D in the Find What box and click the Format button.
    5. Select Font and choose regular. Click OK.
    6. Enter E in the Replace With box and click the Format button.
    7. Select Font and choose bold. Click OK.

    B. Microsoft Excel. The MID function will break up any text string into one character per cell: MID($A$1,B1,1). The original text goes into cell A1; column B contains a range of numbers from one to the largest estimated number of characters in A1. The number of Es can be counted with the statement COUNTIF(C1:C1000,"e").
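    The same bookkeeping can be done in a few lines of code instead of a word processor. The sketch below is one possible approach, not part of the original exercise: it applies all guesses in a single pass and writes solved letters in lower case, which plays the role of the bold text. The function name apply_partial_key is an assumption made for the illustration.

```python
def apply_partial_key(ciphertext, guesses):
    """Replace guessed ciphertext letters with lower-case plaintext letters.

    `guesses` maps a ciphertext letter to the plaintext letter it is believed
    to stand for. All replacements happen in one pass, so a letter that has
    already been solved is never replaced a second time.
    """
    table = str.maketrans({c.upper(): p.lower() for c, p in guesses.items()})
    return ciphertext.upper().translate(table)

cipher = "SGD BNCD"
print(apply_partial_key(cipher, {"D": "E", "S": "T"}))  # tGe BNCe
print(apply_partial_key(cipher, {"D": "E", "E": "F"}))  # SGe BNCe

# Rough equivalent of the spreadsheet approach: one character per cell, then count.
cells = list(cipher)
print(cells.count("D"))                                 # 2
```

    In the second call the guess "E stands for F" leaves the freshly deciphered e's untouched, which is exactly the problem the search-by-format trick works around.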


    The key is 1.

    Once you know that X=1, you can decipher hundreds of pages of ciphertext easily. Your computer can do it in a blink of an eye. You don't have to puzzle out half-deciphered words like "AKOHAAET" (alphabet).

    Caesar's code

    The HAL substitution cipher is one of the oldest in the world. It was used by Julius Caesar to send orders and messages to his legions more than 2,000 years ago. Caesar used the key of three (A=D, B=E, etc.). Like a master passkey, the key to a cipher opens all of its doors.

    It is possible to make up substitution ciphers that are considerably harder to crack than these very easy examples. One thing you can do is hide word lengths by putting all the text in five-character blocks, like this:

    ITRTR TARTH TTTEE ABHKE TTERN ETHEA KOHAA ETVHT HTHEN MEAEH HMCHT

    Another way to make a better substitution cipher is to use random numbers instead of simply rotating through the alphabet:

     9  1 13 15 21  5  7  2  8 26 10 18 14 23  3 22 16 24 25 20 11  4  6 19 12 17
     A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z

    Breaking this cipher would require you to solve for all 26 letters, since there is no obvious pattern, such as A = B; C = D; E = F. . . . Here "E" is represented by the 21st letter of the alphabet, "U." Interestingly enough, "T" is represented by the 20th letter, which is also "T."

    While this is a somewhat harder code to crack, a professional cryptographer would pounce on all those Us and Ts, even without a computer, using the underlying frequency pattern of the letters to tear this cipher open like a can of sardines.

    Hiding letter frequencies with the Vigenere cipher

    An adaptation of the Caesar Code that does hide letter frequency is the Vigenere cipher, which uses the following table to produce multiple one-letter keys:


    The first row uses a Caesar shift of 0; the second a shift of 1 and the last a shift of 25. To use this table, first choose a keyword or phrase such as "heartburn."

      A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    A A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    B B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
    C C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
    D D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
    E E F G H I J K L M N O P Q R S T U V W X Y Z A B C D
    F F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
    G G H I J K L M N O P Q R S T U V W X Y Z A B C D E F
    H H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
    I I J K L M N O P Q R S T U V W X Y Z A B C D E F G H
    J J K L M N O P Q R S T U V W X Y Z A B C D E F G H I
    K K L M N O P Q R S T U V W X Y Z A B C D E F G H I J
    L L M N O P Q R S T U V W X Y Z A B C D E F G H I J K
    M M N O P Q R S T U V W X Y Z A B C D E F G H I J K L
    N N O P Q R S T U V W X Y Z A B C D E F G H I J K L M
    O O P Q R S T U V W X Y Z A B C D E F G H I J K L M N
    P P Q R S T U V W X Y Z A B C D E F G H I J K L M N O
    Q Q R S T U V W X Y Z A B C D E F G H I J K L M N O P
    R R S T U V W X Y Z A B C D E F G H I J K L M N O P Q
    S S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
    T T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
    U U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
    V V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
    W W X Y Z A B C D E F G H I J K L M N O P Q R S T U V
    X X Y Z A B C D E F G H I J K L M N O P Q R S T U V W
    Y Y Z A B C D E F G H I J K L M N O P Q R S T U V W X
    Z Z A B C D E F G H I J K L M N O P Q R S T U V W X Y
      A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
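    Since every row of the table is just a Caesar shift, looking up an intersection amounts to adding the positions of the two letters and wrapping around after Z. The sketch below relies on that observation; the function name vigenere, the message "ATTACK AT DAWN" and the keyword "LEMON" are assumptions chosen for the illustration and do not come from this primer.

```python
import string

ALPHABET = string.ascii_uppercase

def vigenere(text, keyword, decrypt=False):
    """Encrypt (or decrypt) with the table by adding (or subtracting) letter positions."""
    letters = [ch for ch in text.upper() if ch in ALPHABET]   # drop spaces and punctuation
    key = [ALPHABET.index(k) for k in keyword.upper() if k in ALPHABET]
    sign = -1 if decrypt else 1
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + sign * key[i % len(key)]) % 26]
        for i, ch in enumerate(letters)
    )

secret = vigenere("ATTACK AT DAWN", "LEMON")
print(secret)                                   # LXFOPVEFRNHR
print(vigenere(secret, "LEMON", decrypt=True))  # ATTACKATDAWN
```

    The keyword is reused from its first letter whenever the message is longer than the keyword, just as it is when the keyword is written repeatedly above the message by hand.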


    Next, write the keyword above the message without spaces, repeating it as necessary. Finally, encrypt each letter of the message, "To be or not to be," by locating the intersection of the plaintext letter and the keyword letter:

    Keyword      H E A R T B U R N H E A R
    Plaintext    T O B E O R N O T T O B E
    Ciphertext   A S B V H S H F G A S B V

    The Vigenere Cipher is considerably more difficult to break than single-key substitution ciphers, especially if all you have to attack it with is a pencil and paper. Notice, for example, that the two Hs do not stand for the same letter (one is O, the other N). On the other hand, the two Bs do stand for the same letter, which happens to be B itself. Standard frequency analysis as we used with the HAL cipher will not reveal the Es, Ts, As, Os and Ns.

    However, this cipher was invented by Blaise de Vigenere in the 16th Century and was demolished by modern cryptographers long ago. The Vigenere Cipher does not produce truly random ciphertext because it repeats itself every time the key repeats. You can see how a two-letter key produces more repetition, and less security, than a nine-letter key. U.S. export laws measure the strength of encryption programs primarily by looking at the maximum key length supported by the program. It is illegal to export encryption with keys that are longer than 40 bits in length without a special export license.

    The secret to breaking the Vigenere Cipher is learning the length of the key. Suppose you discover that the key is nine characters long. Then you can analyze every ninth letter just as you did when you broke the HAL cipher. In other words, the letters in the series 1, 10, 19, 28 . . . will make up a set of letters where the most common and least common letters will reveal themselves. Similar analysis can be performed on letters 2, 11, 20, 29 . . . and so on.

    But how do you discover the length of the key? Applying something called the Index of Coincidence can do this. You do this by splitting the ciphertext in two blocks and counting the times that the letter in the upper block is the same as the letter in the lower block. In English


    Transposition codes


    OH: S MS DR S S CT WZ

    A DE S P E T UHHI Z

    E A E UM A OC B E T Z

    S M YT OHL OS L HZ

    I UOS . E V A L B DT E A Z

    TSSVP RILYREITLTHIEIA E,E,TTPHEMDECTI,A UEHERP EHN IKJTRA

    ELI2 E NWB UE BO EN0O TAAMTS OENI 0DCHMS.HTEFTESCT1YOIE I A Z OH:SMSDSSCTWZADE SP ET UHHIZ E AEUMAOCB ETZS M YTOHL OSL HZIUOS.EVALBDTE

    To attack a transposition cipher, begin as you did with the Caesar Cipher, by analyzing the letter frequency:

    Letter   Count     Letter   Count
    A        11        N        4
    B        4         O        9
    C        5         P        4
    D        5         Q        0
    E        24        R        5
    F        1         S        12
    G        0         T        17
    H        11        U        5
    I        11        V        2
    J        1         W        2
    K        1         X        0
    L        6         Y        3
    M        6         Z        6

    The table shows that "E" appears 24 times, "T" appears 17 times, and that "Q" and "X" do not appear at all. This conforms both to our Scrabble-playing experience and to the known incidence of letters in English writing, which are listed here in order from the most common to the least:

    Standard English   ETAONRISHDLFCMUGPYWBVKXJQZ
    Ciphertext         ETSAHIOLMCDRUBNPYVWFJKGQXZ

    Even a small amount of ciphertext (in this case, just 200 characters) is enough to spot the telltale frequency patterns of the English language.

    This letter frequency strongly suggests a transposition, rather than a substitution cipher.

    A known plaintext attack

    Transposition ciphers can be hard to crack, especially if you are armed only with graph paper. But in this example, you have a big advantage. You can launch a Known Plaintext Attack against the message. This means you know both the plaintext and the ciphertext of a single paragraph taken from this primer. You can use that knowledge to recover the key, which you can use to decode not just one paragraph, but the entire primer.

    Knowing that the first words of the message are "There is a simple code . . ." count the characters between the first instance of a "T" and an "H." There are 16. Now count 16 characters from the "H." Sure enough, you come to an "E." If you count 16 characters from the "E," you find an "R." Clearly, this is no coincidence.

    Create a box on your graph paper that is 16 boxes wide. Write the ciphertext, one character per box. Reading downward instead of across exposes the plaintext.

    What if you don't know any plaintext? Well, then you have to use some trial and error, but it's not as difficult as you might think. A dedicated code-breaking program can solve much more difficult transposition ciphers than this at the speed of light, but even commonly available tools such as Microsoft Word can automate your cryptanalysis considerably. In Word, use the Text to Table and Table to Text commands to try out different numbers of columns until you find the right one. If you know how to record macros, you can easily print out page after page of possible tables.
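    The graph-paper procedure can also be automated. The sketch below builds its own small example rather than using the primer's ciphertext, and it assumes a plain column-by-column transposition padded with the letter Z; the function names encrypt, decrypt and find_width are invented for the illustration.

```python
def encrypt(plaintext, width, pad="Z"):
    """Write the text down the columns of a grid `width` boxes wide, then read across the rows."""
    rows = -(-len(plaintext) // width)            # ceiling division
    padded = plaintext.ljust(rows * width, pad)   # pad so the grid is completely filled
    return "".join(padded[c * rows + r] for r in range(rows) for c in range(width))

def decrypt(ciphertext, width):
    """Write the ciphertext across rows of `width` boxes, then read down the columns."""
    rows = len(ciphertext) // width
    return "".join(ciphertext[r * width + c] for c in range(width) for r in range(rows))

def find_width(ciphertext, known_start):
    """Known plaintext attack: try each grid width until the known opening words appear."""
    for width in range(2, len(ciphertext)):
        if len(ciphertext) % width == 0 and decrypt(ciphertext, width).startswith(known_start):
            return width
    return None

message = "THERE IS A SIMPLE CODE USED IN THE MOVIE"
secret = encrypt(message, 16)
width = find_width(secret, "THERE")
print(width)                    # 16
print(decrypt(secret, width))   # the message followed by padding Zs
```

    Once the width is known it acts as the key, so every other message enciphered with the same grid can be read the same way.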


    Contemporary Encryption

    It hardly needs saying that numbers of these sizes, and the computing power necessary to manipulate them, have resulted in ciphers that are almost unimaginably more complex than the manual ciphers studied in this pamphlet. At the end of the day, however, these machine-produced ciphers use principles of substitution and transposition that have their roots in classical cryptography. And, since the code breakers have the same advanced tools that are available to the code makers, it is fair to say that the game is still definitely afoot.

    A cardinal principle of modern cryptography is that all security should rest in the key. This means that the inner workings of an algorithm such as DES can be studied and discussed publicly and still produce ciphertext that is unreadable to anyone who does not have the key.

    Symmetric and public keys

    In all of the ciphers we have experimented with, the key used to encrypt and decrypt was the same. This is called symmetric key encryption or secret key encryption. If you and I want to use symmetric key encryption, we must agree on a single key and keep it secret. If you let someone see your key, then my security is compromised along with yours.

    Asymmetric, or public key, encryption uses a pair of keys for each party, one public and one private. I give you my public key so you can send me an encrypted message. I use my private key to read your message.

    In turn, you give me your public key so I can reply to your message. You use your private key to decipher my message. The advantage of this approach is that if either public key is intercepted, it can only be used to send an encrypted message. It cannot be used to decipher a message. Furthermore, if I give away my private key, it does not affect your security.
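    To make the public and private key idea concrete, here is a toy sketch using RSA, one well-known public key algorithm. The primer does not name a particular algorithm, so the choice of RSA, the tiny primes and all of the names below are assumptions made for illustration only; real keys use enormously larger numbers.

```python
# Toy key pair built from two tiny primes (illustrative only).
p, q = 61, 53
n = p * q                  # 3233, part of both the public and the private key
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent: (e, n) is the public key
d = pow(e, -1, phi)        # private exponent: (d, n) is the private key (Python 3.8+)

def encrypt(m, public_exponent=e):
    """Anyone who has the public key can do this."""
    return pow(m, public_exponent, n)

def decrypt(c, private_exponent=d):
    """Only the holder of the private key can undo it."""
    return pow(c, private_exponent, n)

message = 65                  # the letter "A" as a number
ciphertext = encrypt(message)
print(ciphertext)             # 2790
print(decrypt(ciphertext))    # 65
```

    Note that encrypting a second time with the public exponent does not recover the message, which is why an intercepted public key cannot be used to decipher anything.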

    Plaintext            I          B          M
    Decimal equivalent   73         66         77
    Binary equivalent    01001001   01000010   01001101
    Binary ciphertext    01001110   01000101   01001010
    Decimal ciphertext   78         69         74
    Ciphertext           N          E          J
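    The rows of this table can be reproduced in code. The values are consistent with combining each character's binary code with the key 00000111 using exclusive-or (XOR); that operation and key are inferred from the numbers in the table and are not stated in the surviving text, so treat the sketch below as an assumption.

```python
KEY = 0b00000111  # 7; inferred from the table, not stated in the text

def xor_cipher(text, key=KEY):
    """XOR every character code with the key. Applying it twice restores the text."""
    return "".join(chr(ord(ch) ^ key) for ch in text)

# Print one line per letter: plaintext, decimal, binary, binary ciphertext, decimal, ciphertext.
for ch in "IBM":
    c = ord(ch) ^ KEY
    print(ch, ord(ch), f"{ord(ch):08b}", f"{c:08b}", c, chr(c))

print(xor_cipher("IBM"))   # NEJ
print(xor_cipher("NEJ"))   # IBM
```

    Because XOR undoes itself, the same function and the same key serve for both encryption and decryption, which makes this a symmetric key example.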


    Conclusion

    Encryption: Essential for the Internet's growth

    Encryption is crucial if you plan to buy or sell on the Internet. While standards are not as clear as in the case of compression, three emerging security protocols are SSL, IPSec and PPP. Each of these standards supports data compression, which must always occur before encryption. Authentication is a closely related function that protects electronic messages so you can be sure they have not been altered.

    Encryption uses substitution and transposition to hide meaning. These techniques are at least as old as Caesar's Roman Legions. Modern computer-assisted encryption methods produce ciphertext that has no discernible pattern. This ciphertext cannot be compressed because the characters appear to be randomly chosen. Known frequency patterns of the English language, including the index of coincidence, are completely disguised.

    Modern cryptanalysts are frequently able to mount a known plaintext attack. Strong encryption, which includes key lengths of 56 or more bits, should be able to resist such an attack.

    Security is not the only design goal of modern computer systems. Efficiency and performance are also crucial. Hifn's latest generation of products integrates compression, encryption and authentication on a single chip.
