
Transcript of REPORT.pdf

  • CONTENTS

    CHAPTER 1 INTRODUCTION

    CHAPTER 2 LITERATURE REVIEW

    2.1 Introduction

    2.2 Modern Cryptosystem

    2.3 Information Theoretic Approach

    CHAPTER 3 SPURIOUS KEYS AND UNICITY DISTANCE

    3.1 Introduction

    3.2 Spurious Keys Analysis

    3.3 Spurious Keys Analysis Using Natural Language Model

    3.3.1 Introduction

    3.3.2 Code-point Mapping

    CHAPTER 4 TEXT-SPACE BOUNDARY IN STREAM CIPHERS

    4.1 Introduction

    4.2 Mathematical Modeling

    4.3 Code-point Multimapping

    4.3.1 Analysis Using Natural Language Model

    4.4 Index Mapping

    4.5 Model of Encryption for Numeric-Strings

    CHAPTER 5 CONCLUSIONS

  • CHAPTER 1

    INTRODUCTION

  • INTRODUCTION

    Cryptography is used everywhere for secure communication. It implements algorithms that encrypt and decrypt data using a secret key. The difficulty of retrieving the original message from its corresponding cryptogram without knowing the secret key determines the level of security offered by the system. The strength of cryptography can be measured by estimating the computational effort required to break the algorithm, and via the information theoretic approach. The complexity of the algorithm and the size of the key are the two factors determining the strength of cryptography, but both trade off against the performance of the system. The information theoretic approach uses different statistical parameters to measure the strength of cryptography on the basis of the properties of the text-space.

    The characteristics of the encrypted and decrypted texts are a matter of concern in modern cryptosystems. A plain text encrypted with a modern cryptographic algorithm produces a cipher text which has no language property or readability and can therefore easily be marked. Similarly, when a cipher text is decrypted using an incorrect key, the decrypted output simply does not look like text. This characteristic assists adversaries in marking the secret text and comfortably proceeding towards the unique solution by eliminating incorrect keys. There exists a certain set of keys within the key-space which yields text-like-text as decrypted output; these keys are spurious keys. Spurious keys cannot easily be eliminated by cryptanalysis: the attacker requires background information or other means to deal with them. The set of spurious keys associated with a security transaction uplifts the level of secrecy in a secure communication.

    The number of spurious keys is larger for small message texts. The set of spurious keys gradually narrows as the size of the text increases, and there is a text size at which the probability of spurious keys is zero. This size, obtained by a probabilistic approach, is defined as the unicity distance. The unicity distance gives the size of the text for which there is likely to be a single intelligible plain text decryption when a brute-force attack is attempted. If the unicity distance tends to infinity, the cipher is practically unbreakable by brute force. The unicity distance is inversely proportional to the redundancy of the text-space. The redundancy of a text-space consists of those elements which, when eliminated from the space, do not affect the informational value of the texts in that space. For example, for alphanumeric texts, all 1-byte values other than those associated with alphanumeric characters are considered redundant. Redundancy also arises from the characteristics of the language: some content, when filtered out of a message, still allows the complete information of the message to be retained.

    The number of spurious keys remains a strong factor in strengthening cryptography for short messages. According to Shannon's theory, if the cipher text is equiprobable over all message texts then the system is a perfect system. This means that if the number of spurious keys is fairly large, the system tends towards perfection. The use of natural language models to strengthen cryptography has been discussed in many papers. Encrypting a natural language using the UTF-8 encoding standard may lead to a depreciation of spurious keys because of the 3-byte weight of the character code-points. A fairer implementation of the language model is the code-point mapping technique. A code-point is a unique number assigned to an atomic character of the script. The number carries information about the language and about the specific character. The part of the information dealing with the script, the language tag, can be extracted out, since it remains common to all characters of the text; the part which deals with the specificity of the character is mapped to a 1-byte value, which is encrypted. The cipher text is obtained by inserting the language tag into the encrypted data, and the process is reversed for decryption. This approach allows a fair analysis of the natural language model and traces out the strength of cryptography. The unicity distance is comparatively larger for such natural language models than for English.

    The plain text space is generally limited to the characters associated with the language, but as far as modern cryptosystems are concerned the encrypted and decrypted texts can float through any 1-byte value. This leads to a depreciation of spurious keys. If a boundary could be set on the encrypted-decrypted text space, the possibility of obtaining text-like-texts under random encryption and decryption would fairly increase. Cryptanalysis of the one-time pad using a number of plaintext-ciphertext pairs suggests a way to shrink the boundary of the encrypted-decrypted text space for stream ciphers, yielding a fairly longer unicity distance for the cryptosystem. The possibility of bounding text spaces can uplift modern cryptosystems to a level beyond the cipher-text-only attack.

  • CHAPTER 2

    LITERATURE REVIEW

  • 2. LITERATURE REVIEW

    2.1 INTRODUCTION

    Several issues are growing regarding the secrecy of the information that travels in huge mass over the internet throughout the world. Security of information has become a public demand following the issues related to mass surveillance and attacks that have been revealed to the world. Cryptography is a major tool for security over the internet, and it exists everywhere as far as secrecy is concerned.

    There has been explosive growth in unclassified research on strengthening cryptography. Different approaches are available to measure the strength of block ciphers and stream ciphers. Cryptanalysis techniques are used to estimate the strength of an algorithm. Many cryptosystems that were thought to be secure have been broken, and a variety of tools useful in cryptanalysis have been developed. The language used to describe security systems relies on discrete probability.

    2.2 MODERN CRYPTOSYSTEM

    Most of the ciphers that have been examined are not really cryptographically secure, although they may have been adequate prior to the widespread availability of computers. A cryptosystem is called secure if a good cryptanalyst, armed with knowledge of the details of the cipher (but not the key used), would require a prohibitively large amount of computation to decipher a plaintext. This idea (that the potential cryptanalyst knows everything but the key) is called Kerckhoff's Law, or sometimes Shannon's Maxim.

    Kerckhoff's Law at first seems an unreasonably strong rule; given a mass of encrypted data, how is the cryptanalyst to know by what means it was encrypted? Today, most encryption is done by software or hardware that the user did not produce himself. One can reverse engineer a piece of software (or hardware) and make the cryptographic algorithm apparent, so we cannot rely on the secrecy of the method alone as a good measure of security. For example, when you make a purchase over the internet, the encryption method used between your browser and the seller's web server is public knowledge. The security of the transaction depends on the security of the keys.

    There are several widely used ciphers which are believed to be fairly secure. Probably the most commonly used cipher was DES (the Data Encryption Standard), which was developed in the 1970s and was adopted as a standard by the US government. The standard implementation of DES operates on 64-bit blocks (that is, it uses an alphabet of size 2^64: each "character" is 8 bytes long) and uses a 56-bit key. Unfortunately, the 56-bit key means that DES, while secure enough to thwart the casual cryptanalyst, is attackable with special hardware by governments, major corporations, and probably well-heeled criminal organizations: one merely needs to try a large number of the 2^56 possible keys. This is a formidable, but not insurmountable, computational effort. A common variation of DES, called Triple-DES, uses three rounds of regular DES with three different keys (so the key length is effectively 168 bits) and is considerably more secure. DES was originally expected to be used for only a few years when it was first designed; however, due to its surprising resistance to attack, it remained generally unassailable for nearly 25 years. The powerful methods of linear and differential cryptanalysis were developed to attack block ciphers like DES.

    In January 1997, the National Institute of Standards and Technology (NIST) issued a call for a new encryption standard to be developed, called AES (the Advanced Encryption Standard). The requirements were that the candidates operate on 128-bit blocks and support key sizes of 128, 192, and 256 bits. Five ciphers advanced to the second round: MARS, RC6, Rijndael, Serpent, and Twofish. All five were stated to have "adequate security"; Rijndael was adopted as the standard in October 2000.

    Other commonly used ciphers are IDEA (the International Data Encryption Algorithm, developed at ETH Zurich in Switzerland), which uses 128-bit keys, and Blowfish (developed by Bruce Schneier), which uses variable-length keys of up to 448 bits. Both of these are currently believed to be secure. Another common cipher, RC4 (developed by RSA Data Security), can use variable-length keys and, with sufficiently long keys, is believed to be secure. Some versions of the Netscape web browser used RC4 with 40-bit keys for secure communications. A single encrypted session was broken in early 1995 in about 8 days using 112 networked computers; later in the same year a second session was broken in under 32 hours. Given the speed increases in computing since then, it is reasonable to believe that a 40-bit key can be cracked in a few hours. Notice, however, that both of these attacks were essentially brute force, trying a large fraction of the 2^40 possible keys. Increasing the key size resolves that problem. Nearly all browsers these days use at least 128-bit keys.

    2.3 INFORMATION THEORETIC APPROACH

    A cryptosystem has perfect secrecy if for any message x and any encipherment y, p(x|y) = p(x). This implies that for any message-cipher pair there must be at least one key that connects them. According to Shannon's theory, suppose a cryptosystem has |K| = |C| = |P|. The cryptosystem has perfect secrecy if and only if:

    each key is used with equal probability 1/|K|, and

    for every plaintext x and ciphertext y there is a unique key k such that e_k(x) = y.
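    The two conditions can be checked numerically on a toy one-bit one-time pad. The sketch below is not from the source; the skewed message distribution is an arbitrary assumption chosen for illustration.

```python
from fractions import Fraction

# One-bit one-time pad: y = x XOR k, keys equiprobable (condition 1),
# and for each (x, y) exactly one key k with x ^ k = y (condition 2).
p_x = {0: Fraction(3, 4), 1: Fraction(1, 4)}   # arbitrary message distribution
p_k = {0: Fraction(1, 2), 1: Fraction(1, 2)}   # each key used with probability 1/|K|

# Joint distribution p(x, y); each (x, y) pair is reached by a unique key.
p_xy = {(x, x ^ k): p_x[x] * p_k[k] for x in p_x for k in p_k}
p_y = {y: sum(pr for (_, y2), pr in p_xy.items() if y2 == y) for y in (0, 1)}

# Perfect secrecy: p(x|y) = p(x) for every message x and encipherment y.
for x in p_x:
    for y in (0, 1):
        assert p_xy[(x, y)] / p_y[y] == p_x[x]
```

    Note that the check succeeds whatever message distribution is chosen, which is exactly what the definition demands.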

    There are certain issues which have to be addressed theoretically concerning the strength of a cryptosystem, mentioned in the points below:

    the immunity of a system to cryptanalysis when the cryptanalyst has unlimited time and manpower available for the analysis of cryptograms;

    whether a cryptogram has a unique solution (even though it may require an impractical amount of work to find it);

    how much text in a given system must be intercepted before the solution becomes unique;

    whether there are systems for which no information whatever could be extracted by the enemy, no matter how much text is intercepted.

    In the analysis of these problems, the concepts of entropy, redundancy, unicity distance and the like developed in A Mathematical Theory of Communication are used.

    Shannon's entropy represents the amount of information the experimenter lacks prior to learning the outcome of a probabilistic process. According to Shannon's formula, a message's entropy is maximized when the occurrence of each of its individual parts is equally probable. The entropy of a natural language is a statistical parameter that measures how much information is produced on average for every letter of a text in the language. The amount of information in a message is the minimum number of bits required to encode all of its possible meanings.

    In cryptography, the unicity distance is the length of ciphertext needed to break the cipher by reducing the number of possible spurious keys to zero in a brute-force attack. That is, after trying every possible key, there should be just one decipherment that makes sense; it is the expected amount of ciphertext needed to determine the key completely, assuming the underlying message has redundancy.

  • CHAPTER 3

    SPURIOUS KEYS

    &

    UNICITY DISTANCE

  • 3. SPURIOUS KEYS AND UNICITY DISTANCE

    3.1 INTRODUCTION

    Shannon proposed the information theoretic approach in his paper Communication Theory of Secrecy Systems, which explains different parameters related to the strength of cryptography. The resistance to a brute-force or cipher-text-only attack is increased if there exists a large number of text-like-texts corresponding to a cipher text. The keys which give text-like-text are spurious keys. A modern cryptosystem provides a fair number of spurious keys for short texts of fewer than about 10 characters, which indicates that systems are more resistant to cipher-text-only attacks on short messages. The probability of meeting a spurious key during random decryptions depends on the distribution of text-like-texts in the text-space. Text-like-texts are limited by the number of characters associated with the language script: an English text has alphanumeric characters with some special characters, but an encrypted/decrypted text may map a character to any 1-byte value, resulting in transformations which do not look like text. The set of spurious keys filters out invalid decryptions and leaves only text-like-texts from all possible decryptions. The difficulty of eliminating keys and sorting out the unique solution lies with the spurious keys, which cannot be resolved by simple attack mechanisms.

    The number of spurious keys gradually decreases with greater text size and eventually becomes negligible. The threshold at which the probability of having spurious keys is negligible is the unicity distance. Consider a random 1-byte key (k ∈ K, 256 possible keys) which encrypts (XOR) one letter from the 26-letter alphabet. The probability that the decrypted text is valid (a letter) is 26/256. The number of spurious keys becomes negligible once the probability of a valid decryption falls below 1/256, which first happens at a text size of 3, i.e. (26/256)^3 < 1/256. This implies that if the size of the text is 3, essentially none of the 256 keys would give a text of letters as decrypted output; this is the unicity distance. The 1-byte values in the text-space other than the 26 letters are the redundancy of the text-space considered above. This redundancy is inversely related to the unicity distance: had there been no redundancy in the text space, the unicity distance would be infinite.
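    The arithmetic of this worked example is easy to verify; a minimal check, not from the source:

```python
# Smallest text size n at which (26/256)**n drops below 1/256, i.e. the point
# where essentially none of the 256 keys yields an all-letter decryption.
n = 1
while (26 / 256) ** n >= 1 / 256:
    n += 1
assert n == 3   # the unicity distance derived in the text
```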

    3.2 SPURIOUS KEYS ANALYSIS

    The analysis of spurious keys proposed in this thesis relies on discrete probability. The observation of random decryptions applying random keys, and of brute-forcing using numeric keys, holds the Proof of Concept for these statistics. The discrete probability approach is applied assuming a uniform distribution, with definitions of the text spaces and of the desirable events. The plain texts are limited by the number of characters used. Let us consider the English alphabet (n_A = 26) as the allowed set of characters for plain text, while the encrypted/decrypted text can have any 1-byte value (n_U = |{0,1}^8| = 256). The event is that a decrypted text is text-like-text. The probability of the event for a 1-character text is simply 26/256, as there are 26 favourable outcomes out of 256 possibilities. The probability of the event for different text sizes is presented in Table 3.2.a.

    S.No.   Text-Size   P.E
    1       1           0.1016
    2       2           0.0103
    3       4           1.06*10^-4
    4       8           1.13*10^-8
    5       16          1.28*10^-16

    Table 3.2.a: Probability of Event for different English text sizes

    The probability of the event occurring during random decryption is fairly large for smaller text sizes and decreases gradually with size, as illustrated in Table 3.2.a. For text sizes of about 25 characters or more, the probability of the event falls below 2^-80, a negligible value that will not occur in a lifetime; this point gives the unicity distance. The event discussed here is directly associated with the probability of meeting spurious keys during random decryptions.
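    The probabilities in Table 3.2.a and the negligibility threshold can be reproduced in a few lines (a sketch, not the author's code; the function name is illustrative):

```python
def prob_event(n: int) -> float:
    """Probability that a random decryption of n bytes is all letters."""
    return (26 / 256) ** n

assert abs(prob_event(4) - 1.06e-4) < 1e-6    # matches Table 3.2.a, size 4
assert abs(prob_event(8) - 1.13e-8) < 1e-10   # matches Table 3.2.a, size 8

# Smallest size whose event probability is below the negligible 2**-80 bound.
n = 1
while prob_event(n) >= 2 ** -80:
    n += 1
assert n == 25
```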

    The Proof of Concept is based on millions of decryptions performed with different modern cryptographic algorithms. The decryptions followed two approaches: random decryption for intervals of time using random keys (alphabet keys, numeric keys, alphanumeric keys), and brute-forcing using all possible numeric keys. The observations stand as a proof of concept for the proposed statistics, with minor variation within a tolerable range.

    The Proof of Concept (PoC) based on the observation of spurious keys with the random decryption approach on the ARC4 algorithm for different key sizes is illustrated in Table 3.2.b. The observation is based on plain texts over a text space of 62 alphanumeric characters. Decryption was carried out on a number of random alphanumeric texts, using random alphanumeric keys, random alphabet keys and random numeric keys of sizes 8 bytes and 16 bytes. A similar Proof of Concept is presented in Table 3.2.c, based on observations of 8-byte block ciphers with a text size of 8 bytes, performed using random decryption and brute-forcing over all possible numeric keys.

    S.No.   Text-Size   P.E          8-byte key   16-byte key
    1       1           0.2421       0.2423       0.2426
    2       4           3.44*10^-3   3.46*10^-3   3.47*10^-3
    3       8           1.18*10^-5   1.09*10^-5   1.06*10^-5

    Table 3.2.b: Proof of Concept based on analysis of the ARC4 algorithm

    S.No.   Algorithm            P.E (x*10^-5)
    1       Event Distribution   1.18
    2       ARC4                 1.08
    3       DES                  1.12
    4       DES3                 0.99
    5       Blowfish             1.20

    Table 3.2.c: PoC based on analysis of different algorithms for 8-byte text size

  • Figure 3.2.a: Variance of the analysis of different algorithms for 8-byte text size

    3.3. SPURIOUS KEYS ANALYSIS USING NATURAL LANGUAGE MODELS

    3.3.1 INTRODUCTION

    Natural languages need an encoding standard before they can be fed to a modern cryptosystem for encryption. The UTF-8 encoding standard is widely popular, as it accommodates more than one million characters and leaves the existing ASCII characters unchanged. The character code-points of Indian languages have a weight of 3 bytes under UTF-8. This heavier weight gives a larger universe for the encrypted-decrypted texts and thus a lower probability of the event occurring. The probability of the event for different text sizes of the Devanagari script is shown in Table 3.3.a. It is clear that the unicity distance for Devanagari text is approximately 5, far less than that of English text.


    S.No.   Text-Size   P.E
    1       1           7.56*10^-6
    2       2           5.73*10^-11
    3       4           3.28*10^-21
    4       8           1.08*10^-41

    Table 3.3.a: Probability of Event for Devanagari script implementing UTF-8 encoding

    The code-points associated with the characters can be mapped to a 1-byte value for encryption and decryption, as discussed in sub-topic 3.3.2.

    3.3.2 CODE-POINT MAPPING

    Code-point mapping is a method proposed in this thesis for a fair implementation of natural language models. The technique works for languages which have at most 256 character code-points associated with the script. The code-point of a character is a number, written in hexadecimal, that carries two pieces of information: the identity of the language and the specificity of the character. For example, the code-point value of the character 'ka' (क) is 0x0915. Here the '0x09' part of the number remains the same throughout the script, while '15' specifies the character. The character part ('15') of the number can therefore be mapped to a 1-byte value ('\x15'). The mapped value is encrypted, and the extracted part ('0x09') is appended back to the encrypted data to obtain the cipher. The procedure of encryption and decryption is illustrated in figure 3.3.2a.

  • Figure 3.3.2a Implementation of Code-point Mapping for Encryption and Decryption
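    A minimal sketch of the technique (illustrative names, and a plain XOR keystream standing in for the actual cipher; the language tag is assumed common to the whole text, as the technique requires):

```python
def cp_encrypt(text: str, keystream: bytes) -> str:
    """Strip the language tag, XOR-encrypt the 1-byte part, re-attach the tag."""
    tag = ord(text[0]) >> 8                       # e.g. 'क' (0x0915) -> 0x09
    low = bytes(ord(c) & 0xFF for c in text)      # '15' part, mapped to 1 byte
    enc = bytes(b ^ k for b, k in zip(low, keystream))
    return ''.join(chr((tag << 8) | b) for b in enc)  # insert the tag back

def cp_decrypt(cipher: str, keystream: bytes) -> str:
    return cp_encrypt(cipher, keystream)          # XOR makes it symmetric

msg = '\u0915\u0916\u0917'                        # three Devanagari letters
ks = b'\x21\x07\x5e'                              # stand-in keystream bytes
assert cp_decrypt(cp_encrypt(msg, ks), ks) == msg
```

    The cipher text stays inside the script's code-point block, while only one byte per character is actually encrypted.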

    The probability of the event for Devanagari text after the implementation of code-point mapping is fairly larger than for English. Table 3.3.2a provides the proof of concept for the probability of the event for the Devanagari script. The unicity distance at which the probability of the event becomes negligible (less than 2^-80) is 82, which is fairly large compared to English. Table 3.3.2b shows the comparison of the number of spurious keys for Devanagari, Bengali and English (alphanumeric text) with respect to different text sizes. Figure 3.3.2.a is a 3-D bar-graph representation of the comparison, which provides two observations: (a) the probability of having spurious keys at random decryption is fairly high for the Devanagari script, and (b) the probability of the event decreases gradually with the text size.

    S.No.   Text-Size   P.E           ARC4          DES
    1       4           0.0606        0.0603        X
    2       8           3.7*10^-3     3.6*10^-3     3.69*10^-3
    3       16          1.34*10^-5    1.3*10^-5     1.62*10^-5
    4       32          1.81*10^-10   __            __

    Table 3.3.2a: PoC of P.E for Devanagari texts based on analysis using ARC4 and DES algorithms

    S.No.   Text-Size   Devanagari    Bengali       English
    1       4           0.0606        0.0167        3.4*10^-3
    2       8           3.7*10^-3     2.8*10^-4     1.2*10^-5
    3       16          1.3*10^-5     7.7*10^-8     1.4*10^-10
    4       32          1.8*10^-10    6.0*10^-15    3.3*10^-20

    Table 3.3.2.b: Comparative analysis of spurious keys for Devanagari, Bengali and English (alphanumeric)

  • Figure 3.3.2.a Comparative analysis of spurious keys between Devanagari and English


  • CHAPTER 4

    TEXT-SPACE BOUNDARY

    IN

    STREAM CIPHERS

  • 4. TEXT-SPACE BOUNDARY IN STREAM CIPHERS

    4.1 INTRODUCTION

    A stream cipher is the practical application of the one-time pad. It consists of a Pseudo-Random Generator (PRG) which takes the supplied key as a seed and generates a bit sequence equal in length to the text to be encrypted. The encryption is simply the one-time pad (XOR) of the text with the generated bit stream. The one-time pad holds a property with respect to the text space which can be illustrated simply by considering a text space with only two elements, '\x00' and '\x01', i.e. U = {\x00,\x01}^n where n is the size of the text. The encrypted and decrypted texts can be bounded to the same space by applying a MOD-2 operation to the code-points after the XOR. This scenario is shown, with a comparative illustration of the existing model, in figures 4.1.a and 4.1.b: for the proposed model the event is likely to occur at every decryption irrespective of text size. The property holds for any text space whose element set has size 2^x, x = 1 to 8, where MOD 2^x is applied to bound the encrypted-decrypted texts.

    Figure 4.1.a: Simple block diagram of a stream cipher for the {\x00,\x01}^n text-space

    Figure 4.1.b: Stream cipher with a MOD-2 operation block for the {\x00,\x01}^n text-space

    4.2 MATHEMATICAL MODELING:

    Let us define a simple encryption and decryption model for a stream cipher by equations (1) and (2), where m, c, d, k are respectively the message, cipher, decrypted text and key, and E(), D() are the encryption and decryption functions, each taking two arguments:

    c = E(m, k)    (1)

    d = D(c, k)    (2)

    For a stream cipher, E(m, k) and D(c, k) may be defined as

    E(m, k) = m_b ⊕ K_b    (3)

    where K_b = G(k), G() is a Pseudo-Random Generator which takes the key k and generates the keystream K_b, m_b is the byte stream of m, and ⊕ denotes the one-time-pad (XOR) operation, and

    D(c, k) = c_b ⊕ K_b    (4)


    This implies

    D(E(m, k), k) = m    (5)

    A model is proposed in this thesis which is mathematically defined with a boundary limiting the text space. Let a plain text of size n bytes be m, where m ∈ U = {\x00,\x01,…}^n and the size of the set {\x00,\x01,…} is 2^p, p ∈ {1,2,…,8}. Then

    c = E'(m, k) = [ x MOD 2^p : x = byte value of each byte in E(m, k) ]    (6)

    D'(c, k) = [ x MOD 2^p : x = byte value of each byte in D(c, k) ]    (7)

    Case 1: for p = 8,

    E'(m, k) = E(m, k)    (8)

    D'(c, k) = D(c, k)    (9)

    and hence the model replicates the original stream cipher system.

    Case 2: for p = 0,

    E'(m, k) = D'(m, k)    (10)

    and hence this case does not follow the general rules of cryptography.

    The probability of collision increases with smaller values of p, while for larger text sizes (n) the probability of collision is depreciated. Thus, with suitable tradeoffs, this model can be implemented to attain a large number of spurious keys and a large unicity distance.
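    Equations (6) and (7) can be sketched directly. Because XOR acts bitwise and MOD 2^p simply keeps the low p bits, decryption recovers any message already inside the bounded space (illustrative code, with an arbitrary stand-in keystream in place of G(k)):

```python
def bounded_xor(data: bytes, keystream: bytes, p: int) -> bytes:
    """XOR with the keystream, then reduce each byte MOD 2**p (equations 6-7)."""
    mask = (1 << p) - 1                 # x MOD 2**p == x & mask
    return bytes((d ^ k) & mask for d, k in zip(data, keystream))

m = bytes([0, 1, 1, 0, 1])              # m in {\x00,\x01}^5, i.e. p = 1
ks = b'\x9f\x02\x77\x10\xe4'            # stand-in for K_b = G(k)
c = bounded_xor(m, ks, p=1)
assert all(b < 2 for b in c)            # cipher stays inside the bounded space
assert bounded_xor(c, ks, p=1) == m     # D'(E'(m, k), k) = m for bounded m
```

    The collision noted in the text is visible here too: any two keystreams agreeing on their low p bits produce the same cipher.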

    4.3 CODE-POINT MULTIMAPPING:

    The code-point mapping approach, used to implement natural language models, was described in chapter 3. Combining it with a bounded text space in stream ciphers gives code-point multimapping. Code-point multimapping is effective for language models with fewer than 128 characters and so is applicable to English and Indic languages. The language characters map either to the upper half or to the lower half of the 256 1-byte values. The idea of code-point multimapping is therefore to map the code-points twice, to both halves symmetrically, and then apply encryption with the text-space boundary limited to one half by a MOD-128 operation on the stream cipher, as explained in topic 4.2. The universe shrinks to half, leveraging the probability of having more text-like-texts at random decryptions. The effect on the text space of the encryption and decryption models of code-point multimapping is illustrated in figures 4.3.a and 4.3.b respectively.

    Figure 4.3.a Encryption Model with Code-point Multimapping

  • Figure 4.3.b Decryption model with code-point Multimapping

    Table 4.3.a compares the probability of the event for a general ARC4 stream cipher and for ARC4 with the boundary limited by the MOD-128 operation. It is noticed that the number of spurious keys increases as the boundary of the encrypted-decrypted text space shrinks to half.

    S.No.   Text-Size   S.C. without MOD-128   S.C. with MOD-128
    1       8           1.2*10^-5              3.0*10^-3
    2       16          1.4*10^-10             9.2*10^-6
    3       32          1.96*10^-20            8.4*10^-11
    4       64          __                     7.1*10^-21

    Table 4.3.a: Comparative analysis of Probability of Event in a stream cipher with and without the MOD-128 operation

  • 4.3.1 ANALYSIS USING NATURAL LANGUAGE MODEL:

    The implementation of code-point multimapping in stream ciphers results in a fairly larger unicity distance. The probability of the event is observed to remain fair even for larger text sizes such as 256 and 512 characters. Table 4.3.1.a presents a comparative analysis of the probability of the event for code-point mapping and code-point multimapping in a stream cipher. The table is illustrated as a 3-D bar graph in figure 4.3.1.b, where the leverage in the probability of the event in the case of code-point multimapping is easily seen.

    S.No.   Text-Size   Code-point Mapping (ARC4)   Code-point Multimapping (ARC4)
    1       8           3.7*10^-3                   0.94
    2       16          1.3*10^-5                   0.88
    3       32          1.8*10^-10                  0.78
    4       64          3.3*10^-20                  0.60
    5       128         __                          0.37
    6       256         __                          0.13

    Table 4.3.1.a: Comparative Analysis between encryption models implementing code-point mapping and code-point multimapping for Devanagari text

  • Figure 4.3.1.b: Comparative analysis between code-point mapping and code-point multimapping for Devanagari text (stream cipher with and without MOD-128)

    4.4 INDEX MAPPING

    Index mapping is an approach to implementing a stream cipher with a bounded text space. The boundary allows only the first 2^p byte values, p ∈ {1,2,…,8}, to appear in the text space. For example, for p = 2, only the first 2^2 bytes, i.e. '\x00', '\x01', '\x02', '\x03', are allowed elements of the text space. These bytes can be mapped to the indices of an equal-sized set of desirable characters, and encryption implemented on the mapped values. An example for p = 3 is presented here.

    Let the set of desirable characters be A = ['1','2','3','4','5','6','7','8']; the index mapping of these


  • elements to the first 2³ = 8 bytes bounded to the text-space is shown below:

    '1' → '\x00' , '2' → '\x01' , '3' → '\x02' , '4' → '\x03' ,

    '5' → '\x04' , '6' → '\x05' , '7' → '\x06' , '8' → '\x07'
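As an illustrative sketch (not the report's own implementation), the p = 3 index mapping above can be combined with an ARC4 keystream reduced modulo 2³ = 8: addition MOD 8 replaces the usual XOR, so cipher-text bytes never leave the bounded text-space. The key and message below are arbitrary examples.

```python
# Sketch of index mapping (p = 3) over an ARC4/RC4 keystream; the MOD-8
# addition bounds every cipher-text byte to the 8-element text-space.

def rc4_keystream(key: bytes):
    """Standard RC4 key scheduling followed by the PRGA byte stream."""
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        yield S[(S[i] + S[j]) % 256]

A = ['1', '2', '3', '4', '5', '6', '7', '8']   # desirable character set

def encrypt(text: str, key: bytes) -> str:
    ks = rc4_keystream(key)
    # char -> index ('1' -> 0, ..., '8' -> 7), add a keystream byte MOD 8,
    # then reverse-map the bounded byte to a character of A.
    return ''.join(A[(A.index(c) + next(ks)) % 8] for c in text)

def decrypt(text: str, key: bytes) -> str:
    ks = rc4_keystream(key)
    return ''.join(A[(A.index(c) - next(ks)) % 8] for c in text)

cipher = encrypt('12345678', b'example-key')
plain = decrypt(cipher, b'example-key')
```

Both the cipher text and the decrypted text are guaranteed to consist of characters of A, which is exactly the boundary property described in this section.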

    Now, the mapped byte-values are encrypted and decrypted using the stream cipher with the

    boundary on the text-space, and the cipher texts and decrypted texts are simply the characters

    corresponding to the indices retrieved by reverse mapping of the bytes obtained after encryption and

    decryption with the MOD operation. The implementation of the same is illustrated in figure 4.4.a. In

    figure 4.4.b, the implementation of index mapping with a MOD 32 operation and a 32-character set is

    presented.

    Figure 4.4.a Index Mapping with text space of 8 characters

  • Figure 4.4.b: Index Mapping with text space of 32 characters

    The index mapping provides a boundary to the encrypted and decrypted text-space, so that

    every cipher text or decrypted text has elements belonging to the character space. This

    increases the possibility of multiple meaningful messages mapping to one cipher text, as illustrated

    in figure 4.4.c.

    Figure 4.4.c: Spurious Keys Analysis with Index Mapping of 32 Characters

  • 4.5 MODEL OF ENCRYPTION FOR NUMERIC-STRINGS

    A numeric string, or numeric text, is composed of the 10 characters of the set

    string.digits, i.e. ['0','1','2','3','4','5','6','7','8','9']. An encryption scheme with an 8-numeric-character set

    and a boundary on the text space was described in the previous section. There are in total ¹⁰C₈ combinations of

    8-numeric-character sets, so the text space string.digits can be realized as ¹⁰C₈ = 45 unique text-spaces

    of 8 numeric characters each. Implementing 45 encryptions bounded to those 45 text-

    spaces results in an equivalent encryption for numeric strings, with the encrypted and decrypted texts

    bounded to string.digits. Figure 4.5.a shows the implementation of numeric string encryption with

    boundary. The plain text applied to the encryption is a list of phone numbers reported in March 2014.

    Figure 4.5.a: Numeric String Encryption with Text Space Boundary
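The combinatorial claim above (¹⁰C₈ = 45 text-spaces) can be checked directly; the snippet below simply enumerates the 8-character subsets of string.digits and is an illustration, not the report's implementation.

```python
import string
from itertools import combinations

# Every 8-character subset of string.digits is one bounded text-space
# for the numeric-string encryption model.
text_spaces = list(combinations(string.digits, 8))
print(len(text_spaces))   # 45 unique text-spaces
```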

    Now, the encryption model described is very difficult to break by brute force, as each

    decryption leads to text-like-text as output. Partial encryption of texts can also help in

    leveraging confusion, which is illustrated in figure 4.5.b and figure 4.5.c.

  • Figure 4.5.b: Partial Encryption of text using Numeric-String Encryption Model

    (i) (ii) (iii)

    figure 4.5.c Data Frame Encryption using Numeric-String Encryption Model (i) Plain Text,

    (ii) Cipher Text, (iii) Decrypted Text

  • CHAPTER 5

    CONCLUSIONS

  • 5. CONCLUSIONS

    The analysis of spurious keys provides a vision of strengthening cryptography by leading

    cryptosystems beyond the brute-force bound. In this research work, observations were drawn from

    statistics obtained through random and brute-forced decryptions as a proof of concept.

    The analysis was done on modern cryptographic algorithms, including both block ciphers (DES,

    Blowfish) and a stream cipher (ARC4).

    The probability of encountering spurious keys during random decryptions depends on the text-size and

    the text-space (the character set associated with the language used).

    Considering as spurious keys those keys whose decrypted texts consist only of elements within the

    character set of the plain-text space, the probability of a spurious key occurring at random decryption

    for an 8-byte text with a text space of the 26 English alphabets is 10⁻⁸, whereas for an 8-syllable

    Devanagari-script text it is 3×10⁻³ when code-point mapping is implemented for encryption.

    The probability of the event (text-like-text at random decryption) decreases gradually with text size

    and becomes negligible at a certain point, which gives the unicity distance. Considering 2⁻⁸⁰ as a negligible

    probability, the unicity distance for 26-alphabet text is approximately 27 characters, while that for

    Devanagari text rounds off to 81 characters.
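The Devanagari figure of roughly 81 characters can be reproduced from the code-point mapping column of Table 4.3.1.a: fitting a straight line to log₁₀ P against text size and solving for the 2⁻⁸⁰ threshold gives a unicity distance close to 80 characters. The sketch below is an illustration of that estimate, not the report's own computation.

```python
import math

# Probability of events for code-point mapping (ARC4), from Table 4.3.1.a.
sizes = [8, 16, 32, 64]
probs = [3.7e-3, 1.3e-5, 1.8e-10, 3.3e-20]

# Least-squares line: log10(P) = a + b * n.
ys = [math.log10(p) for p in probs]
n_mean = sum(sizes) / len(sizes)
y_mean = sum(ys) / len(ys)
b = sum((n - n_mean) * (y - y_mean) for n, y in zip(sizes, ys)) \
    / sum((n - n_mean) ** 2 for n in sizes)
a = y_mean - b * n_mean

# Unicity distance: smallest n with P(n) below the 2^-80 threshold.
threshold = -80 * math.log10(2)
unicity = (threshold - a) / b
print(round(unicity))   # close to the ~81 characters quoted above
```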

    The implementation of code-point multimapping proves very effective for Devanagari script,

    providing a fair probability of spurious keys even for longer texts of size 256, 512 and more characters.

    The cryptanalysis of the One Time Pad using multiple plain-text/cipher-text pairs indicated that

    it is possible to bound the encrypted and decrypted texts of stream ciphers to a fixed character

    space. Based on this property of the OTP, an encryption model was designed for

    numeric strings for which the unicity distance tends to infinity, with the limitation that only the 10 numeric

    characters are valid for encryption.