Audits, Triage, and the Future of Hashing

 Audits, Triage, and the Future of Hashing Jesse Kornblum

8/12/2019 Audits, Triage,And the Future of Hashing

http://slidepdf.com/reader/full/audits-triageand-the-future-of-hashing 1/39

 Audits, Triage,

and the Future of Hashing

Jesse Kornblum

Outline

• Identical Files

• Hash Collisions

• Why you should stop using MD5

• SHA-3

• NSRL

• Hash Set Auditing

• Fast Hash Matching

• Questions

Identical Files

• Bit for bit identical

 – Identical content

 – Timestamps and filenames may be different

• Cryptographic hashing does NOT prove files are identical

Hash Collisions

• When two different inputs hash to the same value

• When A ≠ B, H(A) = H(B) 

• Extension of the Pigeonhole Principle

Hash Collisions

• Pigeonhole Principle

• 2^128 possible MD5 values

• For a file of length n bits, 2^n possible inputs

• For a 128 KB file (1,048,576 bits), there are 2^1,048,576 possible inputs

• 1,048,576 >> 128

• Therefore there will be hash collisions
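The counting argument can be watched in action by shrinking the hash. The sketch below is an illustration only: it truncates MD5 to 2 bytes (a toy hash invented for this example, with `tiny_hash` and `find_collision` as hypothetical names), so there are only 65,536 possible outputs and 65,537 distinct inputs must produce a repeat.

```python
import hashlib

def tiny_hash(data, nbytes=2):
    """MD5 truncated to nbytes: a toy hash with only 256**nbytes outputs."""
    return hashlib.md5(data).digest()[:nbytes]

def find_collision(nbytes=2):
    """Hash distinct inputs until two of them share a truncated digest."""
    seen = {}  # truncated digest -> first input that produced it
    # 256**nbytes + 1 distinct inputs guarantee a repeat (pigeonhole).
    for i in range(256 ** nbytes + 1):
        msg = str(i).encode()
        h = tiny_hash(msg, nbytes)
        if h in seen:
            return seen[h], msg
        seen[h] = msg
    raise AssertionError("unreachable: pigeonhole principle")

a, b = find_collision()
print(a, b, tiny_hash(a).hex())
```

The same argument applies to full MD5; the only difference is that 2^128 outputs make a blind search like this one impractical.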

Picture courtesy Flickr user addedentry and used under a Creative Commons license, http://www.flickr.com/photos/addedentry/3273096118/

Types of Attacks

• Collision Attack

 – Find m1 and m2 such that H(m1) = H(m2)

 – Can be used to create apps with different functionality but the same hash

• But you can't choose the hash output

• Chosen Prefix Collision Attack

 – Given p1 and p2, find m1 and m2 such that H(p1||m1) = H(p2||m2)

 – Can be used to forge code signatures

• But you can't choose the hash output

Types of Attacks

• Preimage Attack

 – Given hash output h, find m such that H(m) = h

 – Find a new input which matches a chosen hash

• That new input may not be meaningful

• Second Preimage Attack

 – Given m1, find m2 such that H(m1) = H(m2)

 – From existing exe, generate new file which has the same hash

• That new input may not be meaningful
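To make the second preimage definition concrete, here is a brute-force sketch against a deliberately weak toy hash (the first 2 bytes of MD5). This is an assumption made purely for illustration: no such shortcut is implied for full MD5, and `toy_hash` and `second_preimage` are hypothetical names.

```python
import hashlib
from itertools import count

def toy_hash(data):
    """First 2 bytes of MD5; small enough to brute force almost instantly."""
    return hashlib.md5(data).digest()[:2]

def second_preimage(m1):
    """Given m1, search for a different m2 with toy_hash(m2) == toy_hash(m1)."""
    target = toy_hash(m1)
    for i in count():
        m2 = i.to_bytes(8, "big")  # candidate inputs are just counters
        if m2 != m1 and toy_hash(m2) == target:
            return m2

original = b"pretend this is an existing exe"
forged = second_preimage(original)
print(forged.hex(), toy_hash(forged) == toy_hash(original))
```

The forged input is eight arbitrary bytes, which illustrates the caveat above: an input that matches the hash is not necessarily meaningful.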

MD5 Attacks

• Published in 1992

• Cryptographically broken in 1996

 – Different from a practical break

• Collision technique developed in 2004 by Wang et al.

• Chosen prefix attack published in 2007

MD5 Attack Demo

Types of Attacks

• Collision Attack

 – Find m1 and m2 such that H(m1) = H(m2)

 – Takes seconds on a netbook

• Chosen Prefix Collision Attack

 – Given p1 and p2, find m1 and m2 such that H(p1||m1) = H(p2||m2)

 – Takes a cluster of PlayStation 3s

• Preimage Attack

 – Given hash output h, find m such that H(m) = h

• Second Preimage Attack

 – Given m1, find m2 such that H(m1) = H(m2)

The Future of MD5

Picture courtesy Flickr user katerha and used under a Creative Commons license, http://www.flickr.com/photos/katerha/4526272937/

Hash Algorithm Families

• MD5 is based on the Merkle–Damgård construction

 – Method to turn one-way compression functions into collision-resistant hash functions

 – So are SHA-1 and SHA-2

• Weaknesses found in MD5 may be applicable to SHA-1 and SHA-2

 – They may not survive long!
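The Merkle–Damgård idea is to pad the message, split it into fixed-size blocks, and feed each block together with a running state through a compression function. The sketch below is a toy, not MD5: the block size, state size, and the truncated-SHA-256 stand-in for the compression function are all assumptions chosen to keep the example short.

```python
import hashlib

BLOCK = 16  # bytes per message block (toy value)
STATE = 8   # bytes of chaining state and of the final digest (toy value)

def compress(state, block):
    """Stand-in one-way compression function: (state, block) -> new state."""
    return hashlib.sha256(state + block).digest()[:STATE]

def md_pad(message):
    """Append 0x80, zero fill, then the 8-byte message length in bits."""
    padded = message + b"\x80"
    padded += b"\x00" * (-(len(padded) + 8) % BLOCK)
    return padded + (8 * len(message)).to_bytes(8, "big")

def toy_md_hash(message, iv=b"\x00" * STATE):
    """Chain the compression function over the padded blocks."""
    state = iv
    padded = md_pad(message)
    for i in range(0, len(padded), BLOCK):
        state = compress(state, padded[i:i + BLOCK])
    return state

print(toy_md_hash(b"hello").hex())
```

Because MD5, SHA-1, and SHA-2 share this chaining structure, an attack that exploits the structure rather than one particular compression function can carry over between them.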

That Algorithm is Our Last Hope

“No, there is another” 

Hash Algorithm Families

• There are many other families

• Davies–Meyer

• Matyas–Meyer–Oseas

• Hirose

• Miyaguchi–Preneel

 – Whirlpool algorithm

 – Part of md5deep suite

SHA-3

• National Institute of Standards and Technology (NIST)

• Competition for SHA-3 standard

 – Much like Rijndael became “AES” 

• Three-year process

• Finalists as of 23 Jan 2012:

 – BLAKE, Grøstl, JH, Keccak, Skein

• Final conference in Spring 2012

 – http://csrc.nist.gov/groups/ST/hash/sha-3/index.html

NSRL

• National Software Reference Library (NSRL)

 – Created by NIST in 2001

• Known files

 – Not guaranteed to be known good

 – NIST will hash anything

 – Vendor contact for each file

NSRL

• Data set is HUGE

 – 78 million hashes in the set

 – 21 million unique hashes

 – About 1.6GB of data

• Best resource that nobody uses

• Lots of programs can parse the file format

 – But have trouble with the full data set
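One way a program might cope with the full data set is to stream the file and keep only the unique MD5 values in compact form. The sketch below is a hypothetical illustration: it assumes a CSV layout with a header row that includes an `MD5` column, the SHA-1 values in the sample are placeholders, and `load_md5_set` and `is_known` are names invented here.

```python
import csv
import io

def load_md5_set(lines):
    """Build a set of raw 16-byte MD5 digests from NSRL-style CSV rows."""
    known = set()
    for row in csv.DictReader(lines):
        # Raw bytes take less memory than 32-character hex strings, and
        # the set silently drops duplicate hashes.
        known.add(bytes.fromhex(row["MD5"]))
    return known

# Tiny stand-in for the real file; the column names are an assumption
# and the SHA-1 values are placeholders.
SAMPLE = io.StringIO(
    '"SHA-1","MD5","FileName"\n'
    '"0000000000000000000000000000000000000000",'
    '"E97295DE2A9FDE547FEAB4FE41DF16CA","mspaint.exe"\n'
    '"0000000000000000000000000000000000000001",'
    '"E97295DE2A9FDE547FEAB4FE41DF16CA","copy_of_mspaint.exe"\n'
)
known = load_md5_set(SAMPLE)

def is_known(md5_hex):
    """Case-insensitive lookup of a hex MD5 in the known set."""
    return bytes.fromhex(md5_hex) in known
```

Passing an open file object instead of `SAMPLE` streams the data row by row, so only the set of unique digests has to fit in memory.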

nsrlquery

Displays unknown files by default

$ md5deep -br * | nsrllookup

305e40dee29d261d0a3dc466f2184e35 unknown.exe

607e033a16006ed1e9987cfc62562f72 EVILEVIL.exe

Can also display known files

$ md5deep -br * | nsrllookup -k

e97295de2a9fde547feab4fe41df16ca mspaint.exe

eee470f2a771fc0b543bdeef74fceca0 msiexec.exe

Kyrus NSRL Server

• Server requires about 1GB of RAM

 – Takes a while to start

• Kyrus is testing a public nsrlquery server with MD5 hashes

 – nsrl.kyr.us

 – Add -s flag for remote server

C:\> md5deep * | nsrllookup -s nsrl.kyr.us

305e40dee29d261d0a3dc466f2184e35 unknown.exe

607e033a16006ed1e9987cfc62562f72 EVILEVIL.exe

nsrlquery Demo

Hash Set Auditing

• Hash sets are used to detect changes

 – Verifying the contents of a downloaded file

 – Determining if your forensics tool has made any changes

• Hash set tools are great for detecting identical files

 – Break down when asked to detect changes

• Current Approaches

 – Report known files found

 – Or report unknown files found

 – Or report known files not found

Example - Bogocopy

C:\> dir /b src

foo.txt

bar.txt

C:\> md5deep src\* > known.txt

C:\> bogocopy src dest

C:\> md5deep -lm known.txt dest\*

dest\foo.txt

dest\bar.txt

C:\> dir /b dest

foo.txt

bar.txt

CONFESSION.DOCX

Hash Set Auditing

• Current Approaches

 – Report known files found

 – Or report known files not found

 – Or report unknown files found

• We want all three of these!

• Along with

 – Report known files found in new location

• Determine what is there

• Determine what's supposed to be there

• Highlight any mismatches
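The audit described above (what is there, what is supposed to be there, and any mismatches) can be sketched in a few lines of Python. This is not how hashdeep is implemented; `hash_tree` and `audit` are hypothetical helpers, whole files are read into memory for brevity, and known content found at a different path is reported as moved even if the original path is also still present.

```python
import hashlib
import os

def hash_tree(root):
    """Map each file's path (relative to root) to its MD5 hex digest."""
    result = {}
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            with open(path, "rb") as f:
                digest = hashlib.md5(f.read()).hexdigest()
            result[os.path.relpath(path, root)] = digest
    return result

def audit(known, found):
    """Compare what is supposed to be there (known) with what is there (found)."""
    report = {"matched": [], "moved": [], "new": [], "missing": []}
    known_hashes = set(known.values())
    found_hashes = set(found.values())
    for path, digest in found.items():
        if known.get(path) == digest:
            report["matched"].append(path)
        elif digest in known_hashes:
            report["moved"].append(path)    # known content, new location
        else:
            report["new"].append(path)      # unknown file found
    for path, digest in known.items():
        if digest not in found_hashes:
            report["missing"].append(path)  # known file not found
    report["passed"] = not (report["moved"] or report["new"] or report["missing"])
    return report
```

In the Bogocopy scenario, `audit(hash_tree("src"), hash_tree("dest"))` would list CONFESSION.DOCX under `new` and set `passed` to False, which is the case that positive matching alone misses.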

Hash Set Auditing

• Hashdeep

 – Part of the md5deep suite

 – http://md5deep.sf.net/

• Can do positive and negative matching

• Multihashing

• Hash set audits

 – Reports any mismatches

 – Finds new files, moved files, files not found

Example – Bogocopy with Hashdeep

C:\> dir /b src

foo.txt

bar.txt

C:\> hashdeep -b src\* > known.txt

C:\> bogocopy src dest

C:\> hashdeep -bak known.txt dest\*

hashdeep: Audit failed

C:\> dir /b dest

foo.txt

bar.txt

CONFESSION.DOCX

Example – Bogocopy with Hashdeep

C:\> hashdeep -vvbak known.txt dest\*

CONFESSION.DOCX: No match

hashdeep: Audit failed

Files matched: 2

Files partially matched: 0

Files moved: 0

New files found: 1

Known files not found: 0

WoW64 Gotcha

• Windows on Windows64

 – x86 emulator for x64-based Windows systems

 – 32-bit view for 32-bit programs running on a 64-bit OS

 – http://msdn.microsoft.com/en-us/library/aa384249(v=vs.85).aspx

• So what?

• 32-bit programs have a different view

 – For example C:\Windows\System32

 – md5deep vs. md5deep64

WoW64 Gotcha

• On a 64-bit OS, 32-bit programs see a different file

C:\> md5deep Windows\System32\ieapfltr.dll

ee9d715af1b928982f417238b9914484

C:\Windows\System32\ieapfltr.dll

(This is actually C:\Windows\SysWOW64\ieapfltr.dll)

C:\> md5deep64 Windows\System32\ieapfltr.dll

8eada158d964e3fd1999ad96c9c507ff

C:\Windows\System32\ieapfltr.dll

Fast Hash Matching

• I did not invent this

• Several other names

 – Fibonacci hashing

 – AccessData Triage Hashing

 – Piecewise Hashing

 – Partial hashing

 – And many more

Picture courtesy Flickr user nvarvel and used under a Creative Commons license, http://www.flickr.com/photos/nvarvel/6269179660/

Fast Hash Matching

• Traditional Approach:

• For each known file

 – Read and compute hash, H(known)

• For each unknown file:

 – Read and compute hash, H(unknown)

 – For each known hash:

• If H(unknown) == H(known)

 – Match!

 Assumptions

• Searching for identical files

 – Based on content

• If any part of the content is not the same, the files are not identical

• Example:

 – Identical files are the same size

 – If two files are not the same size, they are not identical

• Therefore we should compare file sizes first

 – Fast!

• Then part of the file

• Then the whole file

Fast Hash Matching Demo

Outline

• Identical Files

• Hash Collisions

• Why you should stop using MD5

• SHA-3

• NSRL

• Hash Set Auditing

• Fast Hash Matching

Questions?

Jesse Kornblum

 [email protected]
