Hash Functions: A Gentle Introduction

Palash Sarkar

Applied Statistics Unit, Indian Statistical Institute, Kolkata

[email protected]

Indian Statistical Institute, 9th December 2011

Palash Sarkar (ISI, Kolkata) hash functions ISI 2011 1 / 23

Compressing Information

A hash function compresses arbitrary information to a short fixed-length string.

Examples of information that can be compressed by a hash function:

An SMS/text message

A digital photo

An MP3 file

A book (e.g. ‘War and Peace’)

The amount of information to be compressed varies but, in each case, the compressed information will be a string of some fixed size.
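The fixed-size output can be observed directly. The following is a minimal sketch using SHA-256 from Python's standard `hashlib` module as the hash function (the slides do not name a specific function here); the helper name `digest_hex` and the sample inputs are illustrative assumptions.

```python
import hashlib

def digest_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

# Inputs of very different sizes (stand-ins for a text message, a photo, a book).
inputs = {
    "sms": b"See you at 5pm",
    "photo": bytes(2_000_000),            # 2 MB of zero bytes
    "book": b"War and Peace " * 200_000,  # a few megabytes of text
}

for name, data in inputs.items():
    d = digest_hex(data)
    # Each hex character carries 4 bits, so 64 characters is 256 bits.
    print(f"{name:6s} {len(data):>9d} bytes -> {len(d) * 4} bits: {d[:16]}...")
```

Whatever the input length, the digest is always 256 bits long.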

Hash Function

$$H : \bigcup_{i=0}^{L} \{0,1\}^i \to \{0,1\}^n.$$

Typically n is at least 160.

L should be “sufficiently large”.

Cryptography is about secrecy and things like that.

There is no secret key in a hash function.

Yet, hash functions are one of the most important cryptographic primitives!

(Certain kinds of hash functions do use a key.)

Why Are Hash Functions Important?

The ability to efficiently compress information is very useful.

Computational task: Given x, compute H(x).

Software: very fast and small memory.

Hardware: small hardware area, low power.

Other algorithms work on the compressed information to realise different cryptographic primitives.

Not all kinds of compression are useful.

The properties required of a hash function depend on the application.

Some Basic Properties

Pre-Image Resistance (One-Wayness): Given an n-bit string y, it should be ‘hard’ to find an x such that $H(x) = y$.

Collision Resistance: It should be ‘hard’ to find two distinct strings $x_1$ and $x_2$ such that $H(x_1) = H(x_2)$.

Second Pre-Image Resistance: Given $x_1$, it should be ‘hard’ to find $x_2 \neq x_1$ such that $H(x_1) = H(x_2)$.

How hard is ‘hard’?

Formally defining ‘hard’ is tricky and involves considering an infinite family of functions.

For concrete hash functions, parametrised approaches can be used.

Relations Among Properties

If one can find second pre-images, then one can find collisions.

Suppose A is an algorithm to find second pre-images.

Take an arbitrary $x_1$; use A on $x_1$ to find a second pre-image $x_2$; return $x_1$ and $x_2$.

There is no clear deterministic relation between finding pre-images and finding collisions. There is, however, a probabilistic relation.

Suppose B is an algorithm to find pre-images.

Take an arbitrary $x_1$; compute $y = H(x_1)$; use B on y to find a pre-image $x_2$; return $x_1$ and $x_2$.

Under some relatively mild assumptions, $x_2$ is different from $x_1$ with significant probability.
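The two reductions can be sketched in code. This is an illustration under stated assumptions, not part of the talk: `H` is a toy hash (SHA-256 truncated to 16 bits) so that the brute-force stand-ins for algorithms A and B finish quickly, and all function names are hypothetical.

```python
import hashlib
from itertools import count

def H(x: bytes) -> bytes:
    """Toy hash: SHA-256 truncated to 16 bits, so brute force is fast."""
    return hashlib.sha256(x).digest()[:2]

def candidates():
    """Enumerate candidate messages: 4-byte encodings of 0, 1, 2, ..."""
    for i in count():
        yield i.to_bytes(4, "big")

def find_second_preimage(x1: bytes) -> bytes:
    """Stand-in for algorithm A: return x2 != x1 with H(x2) == H(x1)."""
    target = H(x1)
    for x2 in candidates():
        if x2 != x1 and H(x2) == target:
            return x2

def find_preimage(y: bytes) -> bytes:
    """Stand-in for algorithm B: return some x with H(x) == y."""
    for x in candidates():
        if H(x) == y:
            return x

def collision_from_second_preimage(x1: bytes):
    """Deterministic reduction: A's output together with x1 is a collision."""
    return x1, find_second_preimage(x1)

def collision_from_preimage(x1: bytes):
    """Probabilistic reduction: B may return x1 itself, in which case we fail."""
    x2 = find_preimage(H(x1))
    return (x1, x2) if x2 != x1 else None
```

In this toy setting, choosing `x1` outside the 4-byte search space of `find_preimage` guarantees that the returned pre-image differs from `x1`. In general that only holds with some probability, which is why the second reduction returns `None` on failure.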

We provide some motivation for these properties.

Digital Signature Scheme

Consists of three probabilistic algorithms (KeyGen, Sign, Verify).

KeyGen generates a pair of keys (sk, vk).

sk is the secret signing key.

vk is the public verification key.

Sign uses a signing key sk on a message M to produce a signature σ.

Verify uses a verification key vk on a message-signature pair (M, σ) to return valid/invalid.

Almost all known digital signature schemes (DSSs) are based on number theoretic/algebraic geometric computations.

If applied in a straightforward manner to the whole message, the performance would be unacceptably slow for most practical applications.

Hash-Then-Sign

Given a DSS (KeyGen, Sign, Verify) and a hash function H: instead of signing M directly, sign H(M).

The sign/verify algorithms are always applied on n-bit strings irrespective of the length of the message.
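A minimal sketch of hash-then-sign follows. The slides do not fix a signature scheme, so this uses toy textbook RSA built from two small Mersenne primes purely as an assumption for illustration; it is insecure and the names `keygen`, `sign` and `verify` are hypothetical. The digest is reduced modulo the toy modulus so that it fits the signature domain.

```python
import hashlib

# Toy textbook-RSA parameters (two Mersenne primes). INSECURE: illustration only.
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q
E = 65537

def keygen():
    """Return (sk, vk) for the toy scheme."""
    d = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)
    return (N, d), (N, E)

def H(message: bytes) -> int:
    """Hash M with SHA-256 and reduce the digest into the toy domain Z_N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(sk, message: bytes) -> int:
    n, d = sk
    return pow(H(message), d, n)       # the signer only ever processes H(M)

def verify(vk, message: bytes, sigma: int) -> bool:
    n, e = vk
    return pow(sigma, e, n) == H(message)

sk, vk = keygen()
long_message = b"x" * 1_000_000        # a 1 MB message
sigma = sign(sk, long_message)
print(verify(vk, long_message, sigma))
```

The expensive modular exponentiation is applied to one fixed-size value regardless of whether the message is a few bytes or a megabyte; only the fast hash computation scales with the message length.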

Forgery: Some Possible Ways

Let $(sk_A, vk_A)$ be the sign/verify keys of Alice.

Eve wants to ‘forge’ Alice’s signature.

Valid forgery: a message-signature pair which verifies with $vk_A$ but was not produced using $sk_A$.

Eve gets two distinct messages $M_1, M_2$ such that $H(M_1) = H(M_2)$; gets Alice to sign $M_1$ to obtain signature σ; then $(M_2, \sigma)$ is a valid forgery.

Prevented by collision resistance of H.

Alice signs a message $M_1$ to produce a signature σ; Eve obtains another message $M_2$ such that $H(M_1) = H(M_2)$; then $(M_2, \sigma)$ is a valid forgery.

Prevented by second pre-image resistance of H.
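The first forgery strategy can be demonstrated when H is weak. The sketch below is an assumption-laden toy, not taken from the talk: the signature scheme is insecure textbook RSA with small fixed primes, and `weak_H` keeps only 16 bits of SHA-256 so that Eve's collision search takes a fraction of a second. All names are hypothetical.

```python
import hashlib

# Toy textbook-RSA key for Alice. INSECURE: illustration only.
P, Q, E = 2**31 - 1, 2**61 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))   # Alice's secret exponent

def weak_H(message: bytes) -> int:
    """Deliberately weak hash: only the first 16 bits of SHA-256."""
    return int.from_bytes(hashlib.sha256(message).digest()[:2], "big")

def alice_sign(message: bytes) -> int:
    return pow(weak_H(message), D, N)

def verify(message: bytes, sigma: int) -> bool:
    return pow(sigma, E, N) == weak_H(message)

def eve_find_collision():
    """Hash distinct messages until two of them share a digest."""
    seen = {}
    i = 0
    while True:
        m = b"pay Eve %d rupees" % i
        h = weak_H(m)
        if h in seen:
            return seen[h], m
        seen[h] = m
        i += 1

m1, m2 = eve_find_collision()
sigma = alice_sign(m1)          # Alice knowingly signs only m1
forged_ok = verify(m2, sigma)   # the same signature also verifies for m2
print(m1, m2, forged_ok)
```

Because only H(M) is signed, any two messages with equal hashes share every signature; the signature scheme itself is never attacked.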

Message Authentication

Sender and Verifier share a common secret key K.

Given a message M, sender generates a tag for the message using K.

Given (M, tag), verifier uses K to determine whether this is valid or invalid.

Messages can be of variable and arbitrary lengths.

HMAC: a hash-based approach; secret key $(k, k')$:

$$\mathrm{tag} = H(k \,\|\, H(k' \,\|\, M)).$$
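As a sketch of how the nested formula relates to HMAC as standardised in RFC 2104: there, the two keys are derived from a single key K by XOR-ing the zero-padded key with the constants `opad` (0x5c bytes) and `ipad` (0x36 bytes). The code below assumes SHA-256 as H (an assumption; the slides leave H generic) and compares the result with Python's standard `hmac` module. The function names are hypothetical.

```python
import hashlib
import hmac

BLOCK = 64  # SHA-256 processes its input in 64-byte blocks

def nested_tag(k: bytes, k_prime: bytes, message: bytes) -> bytes:
    """The two-key formula: tag = H(k || H(k' || M))."""
    inner = hashlib.sha256(k_prime + message).digest()
    return hashlib.sha256(k + inner).digest()

def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC per RFC 2104: k and k' are both derived from one key K."""
    if len(key) > BLOCK:
        key = hashlib.sha256(key).digest()  # over-long keys are hashed first
    key = key.ljust(BLOCK, b"\x00")         # then zero-padded to one block
    k = bytes(b ^ 0x5C for b in key)        # outer key: K xor opad
    k_prime = bytes(b ^ 0x36 for b in key)  # inner key: K xor ipad
    return nested_tag(k, k_prime, message)

def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(hmac_sha256(key, message), tag)

key = b"shared secret key"
msg = b"transfer 100 to Bob"
tag = hmac_sha256(key, msg)
print(tag.hex())
```

In practice one would call `hmac.new(key, msg, hashlib.sha256)` directly; the manual version is only meant to expose the nested structure.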

Construction of Hash Functions

A two-step engineering approach.

Construct a function f which maps ℓ-bit strings to n-bit strings, with ℓ > n.

Such a function is called a compression function.

Use f in some specific manner to construct a hash function H which can compress arbitrary length strings.

Specific manner: called a mode of operation.

A mode of operation should provide certain assurances, e.g., if f is collision resistant, then so is H.

‘Provably Secure’ Hash Functions

There are known constructions of compression/hash functions such that finding pre-images and/or collisions amounts to solving certain computational problems which are conjectured to be hard.

Finding discrete logs.

Finding short vectors in lattices.

Such functions are usually slow and not suitable for ‘heavy duty’ industrial applications.
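The slides do not name a specific construction. One well-known discrete-log example is the compression function of Chaum, van Heijst and Pfitzmann, $f(x_1, x_2) = g^{x_1} h^{x_2} \bmod p$. The sketch below uses tiny toy parameters (an assumption chosen so that brute force works) to show that any collision reveals $\log_g h$; the function names are hypothetical.

```python
# Toy parameters: p = 2q + 1 with p and q prime; g and h are squares mod p,
# so both generate the subgroup of order q. Real parameters would be far larger.
p, q = 2039, 1019
g, h = 4, 9

def f(x1: int, x2: int) -> int:
    """Compression function f(x1, x2) = g^x1 * h^x2 mod p, for 0 <= x1, x2 < q."""
    return (pow(g, x1, p) * pow(h, x2, p)) % p

def find_collision():
    """Brute-force a collision (feasible only because p is tiny)."""
    seen = {}
    for x2 in range(q):
        for x1 in range(q):
            y = f(x1, x2)
            if y in seen:
                return seen[y], (x1, x2)
            seen[y] = (x1, x2)

def dlog_from_collision(a, b) -> int:
    """Turn a collision f(a) = f(b), a != b, into t with g^t = h (mod p).

    g^a1 h^a2 = g^b1 h^b2 gives h^(a2 - b2) = g^(b1 - a1),
    hence t = (b1 - a1) / (a2 - b2) mod q.
    """
    (a1, a2), (b1, b2) = a, b
    return (b1 - a1) * pow((a2 - b2) % q, -1, q) % q

a, b = find_collision()
t = dlog_from_collision(a, b)
print(a, b, t)
```

Read in the other direction, this means that if discrete logs in the subgroup are hard to compute, collisions for `f` are hard to find. The price is two modular exponentiations per call, which illustrates why such functions are slow.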

Iterating a Compression Function

Let $f : \{0,1\}^{768} \to \{0,1\}^{256}$ be a compression function.

Suppose messages are strings of length 512 × 4 = 2048 bits.

Let $M = M_1 \| M_2 \| M_3 \| M_4$ be a message where $|M_i| = 512$.

Define a function $H : \{0,1\}^{2048} \to \{0,1\}^{256}$ as follows.

$$C_1 = f(M_1 \| 0^{256}); \quad C_2 = f(M_2 \| C_1); \quad C_3 = f(M_3 \| C_2); \quad C_4 = f(M_4 \| C_3).$$

$H(M_1 \| M_2 \| M_3 \| M_4)$ is defined to be $C_4 = \mathrm{Iterate}^{(4)}_f(M)$.

If f is pre-image resistant, then so is H.

If f is collision resistant, then so is H.

$0^{256}$ can be replaced by any 256-bit string (IV).

Easy generalisation: f maps m bits to n bits and H takes inputs of length $k(m - n)$ bits for some fixed k > 1.

Needs to be modified to handle variable-length inputs.
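The iteration can be sketched as follows. As a stand-in for the compression function f, this example applies SHA-256 to 96-byte (768-bit) inputs; that choice, and the names `iterate` and `BLOCK_BYTES`, are assumptions made for illustration only.

```python
import hashlib

BLOCK_BYTES = 64   # |M_i| = 512 bits
STATE_BYTES = 32   # n = 256 bits

def f(x: bytes) -> bytes:
    """Stand-in compression function {0,1}^768 -> {0,1}^256."""
    assert len(x) == BLOCK_BYTES + STATE_BYTES
    return hashlib.sha256(x).digest()

def iterate(message: bytes, iv: bytes = bytes(STATE_BYTES)) -> bytes:
    """Compute C_i = f(M_i || C_{i-1}) with C_0 = IV; return the final C_i."""
    assert len(message) > 0 and len(message) % BLOCK_BYTES == 0
    c = iv
    for i in range(0, len(message), BLOCK_BYTES):
        c = f(message[i:i + BLOCK_BYTES] + c)
    return c

def H(message: bytes) -> bytes:
    """The fixed-length hash: exactly four 512-bit blocks."""
    assert len(message) == 4 * BLOCK_BYTES
    return iterate(message)

M = bytes(range(256))   # a 2048-bit message
print(H(M).hex())
```

Passing a different `iv` corresponds to replacing $0^{256}$ by another 256-bit string.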

Handling Variable Length Inputs

Let f be an (m, n) compression function.

Given a message M, let

len(M) denote its length,

$\mathrm{bin}_{m-n}(\mathrm{len}(M))$ denote the (m − n)-bit encoding of its length (assumption: messages are of maximum length $2^{m-n} - 1$).

Define pad(M) to be $M \| 0^k \| \mathrm{bin}_{m-n}(\mathrm{len}(M))$, where k is the minimum non-negative integer such that the entire length is a multiple of (m − n).

Write pad(M) as $M_1 \| M_2 \| \cdots \| M_\ell$ where each $M_i$ is (m − n) bits long.

Define the output of hash function H to be $\mathrm{Iterate}^{(\ell)}_f(\mathrm{pad}(M))$.

If f is collision (resp. pre-image) resistant, then so is H.
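A sketch of the padding rule and the resulting variable-length hash, with m = 768 and n = 256. Two assumptions are made for illustration: messages are whole bytes (so the zero padding is also a whole number of bytes), and f is again a stand-in built from SHA-256 on 96-byte inputs.

```python
import hashlib

M_BITS, N_BITS = 768, 256
BLOCK = (M_BITS - N_BITS) // 8   # (m - n) bits = 64 bytes per block
STATE = N_BITS // 8              # n bits = 32 bytes of chaining value

def f(x: bytes) -> bytes:
    """Stand-in (m, n) compression function."""
    assert len(x) == BLOCK + STATE
    return hashlib.sha256(x).digest()

def pad(message: bytes) -> bytes:
    """Return M || 0^k || bin_{m-n}(len(M)), with len(M) counted in bits."""
    k = -len(message) % BLOCK                           # fewest zero bytes needed
    length_block = (8 * len(message)).to_bytes(BLOCK, "big")
    return message + bytes(k) + length_block

def H(message: bytes) -> bytes:
    """Iterate f over the blocks of pad(M), starting from IV = 0^n."""
    padded = pad(message)
    c = bytes(STATE)
    for i in range(0, len(padded), BLOCK):
        c = f(padded[i:i + BLOCK] + c)
    return c

print(H(b"").hex())
print(H(b"a short message").hex())
```

The length block is what distinguishes messages that differ only in trailing zeros: `pad(b"\x00")` and `pad(b"\x00\x00")` have the same zero-filled first block but different final blocks.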

Variants

In the case where the lengths of messages can be encoded by fixed length bit strings, several variants are known.

These include important practical constructions such as the MD/SHA family.

Puts the focus on constructing suitable compression functions.

A theoretical issue: tackling arbitrary length strings.

Damgård (1989): uses a padding rule which results in a message expansion which is linear in the length of the message.

Sarkar (2009): improved padding rule resulting in message expansion which is logarithmic in the length of the message.

Generic Algorithm: Pre-Image

Model H as a uniform random function, i.e., on distinct inputs, the outputs of H are independent and uniformly distributed over $\{0,1\}^n$.

Finding pre-image: input y.

Choose M; compute H(M); if H(M) = y, return M; otherwise repeat with a fresh M.

Probability of success in one trial: $\Pr[H(M) = y] = 1/2^n$.

Expected number of trials: $2^n$.

Similarly, for finding second pre-image, the expected number of trials is also $2^n$.
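The generic search can be tried on a toy hash. In this sketch (an illustration, not from the talk) H is SHA-256 truncated to n = 12 bits so that the expected $2^{12} = 4096$ trials run instantly; the names and the choice of candidate messages are assumptions.

```python
import hashlib
from itertools import count

N_BITS = 12

def H(message: bytes) -> int:
    """Toy n-bit hash: the top N_BITS bits of SHA-256."""
    full = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return full >> (256 - N_BITS)

def find_preimage(y: int):
    """Try M = 0, 1, 2, ... until H(M) = y; return (M, number of trials)."""
    for trials in count(1):
        m = (trials - 1).to_bytes(8, "big")
        if H(m) == y:
            return m, trials

# Average the trial count over several targets and compare with 2^n.
targets = [H(b"target %d" % i) for i in range(20)]
trial_counts = [find_preimage(y)[1] for y in targets]
average = sum(trial_counts) / len(trial_counts)
print(f"average trials: {average:.0f}, 2^n = {2 ** N_BITS}")
```

With only 20 targets the average fluctuates noticeably from run to run of different target sets, but it should be of the same order as $2^n$. For n = 160 or more, the same loop is far beyond any feasible computation.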

Generic Algorithm: Collision

Choose distinct $M_1, M_2, \ldots, M_q$;
compute $y_1 = H(M_1), y_2 = H(M_2), \ldots, y_q = H(M_q)$;
if $y_i = y_j$, return $M_i, M_j$.

$\Pr[\mathrm{Coll}] = 1 - \Pr[\mathrm{Distinct}(y_1, \ldots, y_q)]$.

\begin{align*}
\Pr[\mathrm{Distinct}(y_1, \ldots, y_q)]
  &= \Pr[y_q \notin \{y_1, \ldots, y_{q-1}\} \mid \mathrm{Distinct}(y_1, \ldots, y_{q-1})] \times \Pr[\mathrm{Distinct}(y_1, \ldots, y_{q-1})] \\
  &= \left(1 - \frac{q-1}{2^n}\right) \times \Pr[\mathrm{Distinct}(y_1, \ldots, y_{q-1})] \\
  &\;\;\vdots \\
  &= \left(1 - \frac{1}{2^n}\right) \times \cdots \times \left(1 - \frac{q-1}{2^n}\right).
\end{align*}

Using standard approximations and simplifications, for $q \approx 2^{n/2}$, a collision occurs with constant probability.
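
One standard approximation is $1 - x \le e^{-x}$, which gives $\Pr[\mathrm{Distinct}] \le \exp(-q(q-1)/2^{n+1})$ and hence $\Pr[\mathrm{Coll}] \ge 1 - \exp(-q(q-1)/2^{n+1})$, roughly $1 - e^{-1/2} \approx 0.39$ at $q = 2^{n/2}$. The sketch below evaluates the exact product next to this bound and runs the generic collision search on a small hash. The helpers `toy_hash`, `prob_distinct` and `find_collision` are hypothetical names introduced for illustration, and truncated SHA-256 is only assumed to behave like a uniform random function.

```python
import hashlib
import math
from typing import Optional, Tuple


def toy_hash(message: bytes, n_bits: int) -> int:
    """Toy n-bit hash: the top n_bits bits of SHA-256 (illustrative stand-in for H)."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return digest >> (256 - n_bits)


def prob_distinct(q: int, n_bits: int) -> float:
    """Exact product (1 - 1/2^n) * ... * (1 - (q-1)/2^n) for a uniform random function."""
    size = 2 ** n_bits
    if q > size:
        return 0.0  # more outputs than possible values: a repeat is certain
    p = 1.0
    for i in range(1, q):
        p *= 1.0 - i / size
    return p


def find_collision(q: int, n_bits: int) -> Optional[Tuple[bytes, bytes]]:
    """Hash q distinct messages; return the first pair with equal hash values, if any."""
    seen = {}
    for i in range(q):
        message = i.to_bytes(8, "big")
        y = toy_hash(message, n_bits)
        if y in seen:
            return seen[y], message
        seen[y] = message
    return None


if __name__ == "__main__":
    n = 32
    q = 2 ** (n // 2)
    exact = 1.0 - prob_distinct(q, n)
    bound = 1.0 - math.exp(-q * (q - 1) / 2 ** (n + 1))
    print(f"q = 2^{n // 2}: exact Pr[Coll] = {exact:.4f}, exponential bound = {bound:.4f}")
    pair = find_collision(4 * q, n)
    print("collision found:" if pair else "no collision in this run:", pair)
```

Storing all $q$ hash values in a dictionary keeps the search at about $q$ hash computations, at the cost of memory proportional to $q$.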

Generic Algorithms: (Multi-)Collision

Modelling H as a uniform random function is an idealisation.

Concrete hash functions are not uniform random functions.

Bellare and Kohno (2004) introduced the notion of balance of a hash function to express resistance to generic attacks.

Ramanna and Sarkar (2011) refined this approach and introduced the notion of r-balance to quantify the resistance of concrete hash functions to generic multi-collision attacks.

Hash Functions and Random Oracles

It would be nice to say that a hash function ‘behaves’ like a uniform random function (a random oracle).

But, how to formalise this?

Compression Function + Mode of Operation = Hash Function.

Assume: Compression function is a random oracle.

The domain of a compression function consists of short fixed length strings.

The range consists of shorter fixed length strings.

The domain of a hash function (is finite and) consists of long and variable length strings.
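
To make "Compression Function + Mode of Operation = Hash Function" concrete, here is a minimal sketch of one well-known mode, Merkle-Damgård iteration with length padding. It is an assumption that this mode is a fair illustration; the slides do not say it is the mode under discussion. The names `compress`, `pad`, `md_hash`, `IV` and the block sizes are hypothetical, and the toy compression function is simply truncated SHA-256.

```python
import hashlib

BLOCK_BYTES = 32          # message block length taken by the toy compression function
STATE_BYTES = 16          # chaining value and output length (n = 128 bits)
IV = bytes(STATE_BYTES)   # fixed, public initial chaining value (all zeros here)


def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: 48-byte input (state + block) to 16-byte output.

    Built from truncated SHA-256 purely for illustration.
    """
    assert len(state) == STATE_BYTES and len(block) == BLOCK_BYTES
    return hashlib.sha256(state + block).digest()[:STATE_BYTES]


def pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes and the 8-byte message length, filling whole blocks."""
    length = len(message).to_bytes(8, "big")
    zeros = (-(len(message) + 1 + 8)) % BLOCK_BYTES
    return message + b"\x80" + bytes(zeros) + length


def md_hash(message: bytes) -> bytes:
    """Mode of operation: feed the padded message to compress one block at a time."""
    state = IV
    padded = pad(message)
    for i in range(0, len(padded), BLOCK_BYTES):
        state = compress(state, padded[i:i + BLOCK_BYTES])
    return state


if __name__ == "__main__":
    print(md_hash(b"War and Peace").hex())
```

The 8-byte length field is what makes the domain finite in this sketch: only messages shorter than $2^{64}$ bytes can be padded.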

Hash Functions and Random Oracles

Problem (in a nutshell): Given a ‘small’ random oracle, is it possible to construct a function which is difficult to tell apart from a ‘big’ random oracle?

Adversary: An algorithm which tries to differentiate a hash function from a random oracle.

A hash function is public and can be queried by the adversary.

The compression function is also public and can also be queried by the adversary.

Outputs of queries to the hash function and the compression function must ‘match’.

Indifferentiability analysis of a mode of operation: To show that the advantage of a resource-bounded adversary is ‘small’.
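
The advantage can be written out as follows, in the way the indifferentiability framework is usually stated. The notation is an assumption, not taken from the slides: $f$ is the compression function modelled as a random oracle, $C^f$ is the hash function obtained by applying the mode of operation to $f$, $R$ is a random oracle with the same domain and range as the hash function, and $S^R$ is a simulator that answers compression-function queries using access to $R$, so that its answers ‘match’ those of $R$.

```latex
\[
\mathrm{Adv}(A) \;=\;
\Bigl|\, \Pr\bigl[A^{\,C^{f},\,f} = 1\bigr] \;-\; \Pr\bigl[A^{\,R,\,S^{R}} = 1\bigr] \Bigr|
\]
```

In the first experiment the adversary $A$ queries the real hash function and the real compression function; in the second it queries the big random oracle and the simulator. The mode is considered indifferentiable from a random oracle if some efficient simulator keeps this difference small for every resource-bounded $A$.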

Hash Functions and Random Oracles

Indifferentiability analysis has become an important tool to analyse hash function modes of operation.

Provides opportunities for proving theorems using combinatorial and discrete probability calculations.

But, there are questions.

Is it really required?

Does it really show that there are no ‘defects’ in the mode of operation?

Summary

Brief discussions on the following questions/issues.

What are hash functions?

Why are they important?

How to construct hash functions?

Resistance to generic attacks.

Hash functions and random oracles.

Left out a lot!

Thank you for your attention!
