Similarity of Binaries through Re-optimization

Similarity of Binaries through re-Optimization By Yaniv David, Nimrod Partush & Eran Yahav Technion, Israel

Transcript of Similarity of Binaries through Re-optimization

Page 1: Similarity of Binaries through Re-optimization

Similarity of Binaries through re-Optimization

By Yaniv David, Nimrod Partush & Eran Yahav

Technion, Israel

Page 2: Similarity of Binaries through Re-optimization

Motivation

[Figure: Developer; httpd; OpenSSL]

Page 3: Similarity of Binaries through Re-optimization

Motivation

[Figure: Security Researcher; four items marked "?"; OpenSSL]

Page 4: Similarity of Binaries through Re-optimization

Problem Definition

Procedure q (the query):

    lea r15, [rax+1]
    sub r13, r15
    xor rax, rax
    add rax, 3
    cmp r13, -2
    ...

Corpus T, with |T| ≥ 10^6:

Procedure t_1:

    mov x0, x20
    mov x20, 3
    add x0, x0, 1
    sub x21, x21, x0
    cmn x21, 2
    ...

Procedure t_2:

    lea r15, [rax+1]
    sub r13, r15
    xor rax, rax
    add rax, 3
    cmp r13, -2
    ...

Procedure t_3:

    mov edi, 10
    mov esi, 489
    mov edx, 0
    ...

Procedure t_4:

    mov rbp, rdi
    mov r12d, esi
    cmp dword ptr [rbp+64], 0
    jz short loc_143E
    ...

…

Procedure t_|T|:

    push r12
    push rbx
    push rbp
    sub rsp, 10
    ...

Compiler and optimization-level labels shown beside the procedures: gcc 4.8 -O0 (procedure t_1), icc 15.0.3 -O3, Clang 3.4 -Os, Clang 4 -O1, icc 15.0.3 -O3, gcc 4.6 -Os

Page 5: Similarity of Binaries through Re-optimization

Challenge

OpenSSL's dtls1_buffer_message()

icc 15.0.3 -O3:

    lea r15, [rax+1]
    sub r13, r15
    xor rax, rax
    add rax, 3
    cmp r13, -2

gcc 4.8 -O0:

    mov x0, x20
    mov x20, 3
    add x0, x0, 1
    sub x21, x21, x0
    cmn x21, 2

Page 6: Similarity of Binaries through Re-optimization

Our Approach

For finding Similarity of Binaries

Page 7: Similarity of Binaries through Re-optimization

Our Approach: What

Query q: dtls1_buffer_message()
Compiler: icc 15.0.3 -O3
Architecture:

    push r12
    push rbx
    push rbp
    ...

    lea r15, [rax+1]
    sub r13, r15
    xor rax, rax
    add rax, 3
    cmp r13, -2

Query t_1: dtls1_buffer_message()
Compiler: gcc 4.8 -O0
Architecture:

    mov x0, x20
    mov x20, 3
    add x0, x0, 1
    sub x21, x21, x0
    cmn x21, 2
    ...

[Figure: each procedure is drawn as blocks joined by True/False branch edges; an "=" sign links blocks of the two procedures]

Page 8: Similarity of Binaries through Re-optimization

Our Approach: What

Query t_1: dtls1_buffer_message()
Compiler: gcc 4.8 -O0
Architecture:

    mov x0, x20
    mov x20, 3
    add x0, x0, 1
    sub x21, x21, x0
    cmn x21, 2
    ...

Query q: dtls1_buffer_message()
Compiler: icc 15.0.3 -O3
Architecture:

    push r12
    push rbx
    push rbp
    ...

    lea r15, [rax+1]
    sub r13, r15
    xor rax, rax
    add rax, 3
    cmp r13, -2

[Figure: each procedure is drawn as blocks joined by True/False branch edges]

Page 9: Similarity of Binaries through Re-optimization

Our Approach: What

Procedure t_2: unrelated()
Compiler: icc 15.0.3 -O3
Architecture:

    push r12
    push rbx
    push rbp
    ...

Query q: dtls1_buffer_message()
Compiler: icc 15.0.3 -O3
Architecture:

    push r12
    push rbx
    push rbp
    ...

    lea r15, [rax+1]
    sub r13, r15
    xor rax, rax
    add rax, 3
    cmp r13, -2
    ...

[Figure: each procedure is drawn as blocks joined by True/False branch edges; an "=" sign links blocks of the two procedures]

Page 10: Similarity of Binaries through Re-optimization

Our Approach: How

• Decompose the procedure into fragments
• Transform fragments to canonical form
• Count shared fragments, weighting each by its statistical significance

ARM-64 fragment:

    mov x0, x20
    add x0, x0, 1
    sub x21, x21, x0
    cmn x21, 2

x86_64 fragment:

    lea r15, [rax+1]
    sub r13, r15
    cmp r13, -2

Both fragments have the same canonical form (=):

    tmp0 = register0 + 1
    register1 = tmp0
    tmp1 = register2 - tmp0
    register2 = tmp1
    tmp2 = tmp1 - 2

Page 11: Similarity of Binaries through Re-optimization

Decomposing Assembly Procedures

• The procedure is broken at basic block level
• Block ordering is ignored

    push r12
    push rbx
    push rbp
    sub rsp, 10
    ...

    mov edi, 10
    ...

    lea r15, [rax+1]
    sub r13, r15
    ...

[Figure: the basic blocks of the procedure, joined by True/False branch edges]

Page 12: Similarity of Binaries through Re-optimization

Slicing Basic Blocks

• We use slicing to break basic blocks into separate data-independent computations:

Basic block:

    mov x0, x20
    mov x20, 3
    add x0, x0, 1
    sub x21, x21, x0
    cmn x21, 2

Slice 1:

    mov x0, x20
    add x0, x0, 1
    sub x21, x21, x0
    cmn x21, 2

Slice 2:

    mov x20, 3
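The slicing step can be illustrated with a minimal Python sketch. It rests on assumptions that are not in the slides: each instruction is annotated by hand with the registers it defines and uses, `Inst` and `extract_slices` are hypothetical names (not GitZ's API), and a slice is grown backward from the last instruction that no earlier slice has claimed.

```python
from typing import NamedTuple


class Inst(NamedTuple):
    text: str
    defs: frozenset  # registers/flags written by the instruction
    uses: frozenset  # registers read by the instruction


def extract_slices(block):
    """Split a basic block into backward slices (data-dependent chains)."""
    unclaimed = set(range(len(block)))
    slices = []
    while unclaimed:
        top = max(unclaimed)          # start from the last unclaimed instruction
        unclaimed.discard(top)
        members = [top]
        needed = set(block[top].uses)
        for i in range(top - 1, -1, -1):
            inst = block[i]
            if inst.defs & needed:    # this instruction feeds the slice
                members.append(i)
                needed = (needed - inst.defs) | inst.uses
                unclaimed.discard(i)
        slices.append([block[i].text for i in sorted(members)])
    return slices


# The basic block from the slide, annotated by hand (hypothetical encoding).
BLOCK = [
    Inst("mov x0, x20", frozenset({"x0"}), frozenset({"x20"})),
    Inst("mov x20, 3", frozenset({"x20"}), frozenset()),
    Inst("add x0, x0, 1", frozenset({"x0"}), frozenset({"x0"})),
    Inst("sub x21, x21, x0", frozenset({"x21"}), frozenset({"x21", "x0"})),
    Inst("cmn x21, 2", frozenset({"flags"}), frozenset({"x21"})),
]

for number, one_slice in enumerate(extract_slices(BLOCK), start=1):
    print(f"slice {number}: {one_slice}")
```

Under these assumptions the sketch separates `mov x20, 3` from the other four instructions, as on the slide: the write to `x20` happens after `x20` has already been read, so it does not feed the first slice.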

Page 13: Similarity of Binaries through Re-optimization

Moving to Canonical Form (re-Optimizing)

• “Out-of-context” re-optimization

Assembly slice:

    mov x0, x20
    add x0, x0, 1
    sub x21, x21, x0
    cmn x21, 2

Lift:

    t0 = load x20
    store t0, x0
    t1 = load x0
    t2 = 1
    t3 = add t1, t2
    store t3, x0
    ...

Optimize (LLVM):

    t0 = load x20
    t1 = add t0, 1
    store t1, x0
    ...

Normalize:

    t0 = load r0
    t1 = add t0, 1
    store t1, r1
    ...
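The Normalize stage can be sketched in a few lines of Python. This is only an illustration under assumptions: registers are renamed to `r0`, `r1`, … in order of first appearance (which is consistent with the example on this slide), the `REGISTER` pattern covers just the register names used here, and the x86_64 input is a hypothetical optimized form of `lea r15, [rax+1]`, not output taken from GitZ.

```python
import re

# Assumed register pattern: AArch64 x<N>, plus a few x86_64 names.
REGISTER = re.compile(r"\b(?:x\d+|r(?:\d+|[abcd]x|[sd]i|[sb]p))\b")


def normalize(fragment):
    """Rename registers to r0, r1, ... in order of first appearance."""
    names = {}

    def rename(match):
        register = match.group(0)
        if register not in names:
            names[register] = f"r{len(names)}"
        return names[register]

    return [REGISTER.sub(rename, line) for line in fragment]


# Optimized ARM-64 fragment from the slide.
arm_fragment = ["t0 = load x20", "t1 = add t0, 1", "store t1, x0"]
# Hypothetical optimized x86_64 counterpart.
x86_fragment = ["t0 = load rax", "t1 = add t0, 1", "store t1, r15"]

print(normalize(arm_fragment))
print(normalize(arm_fragment) == normalize(x86_fragment))
```

After renaming, the two fragments become textually identical, which is what allows fragments from different architectures to be compared as set elements.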

Page 14: Similarity of Binaries through Re-optimization

Procedure Representation

dtls1_buffer_message():

    push r12
    push rbx
    push rbp
    sub rsp, 10
    ...

    mov edi, 10
    ...

    lea r15, [rax+1]
    sub r13, r15
    ...

[Figure: the basic blocks of the procedure, joined by True/False branch edges]

R(dtls1_buffer_message) = { fragment 1, fragment 2, … }

Fragment 1:

    t0 = load r0
    t1 = add t0, 1
    store t1, r1
    t2 = load r2
    t3 = sub t2, t1
    store t3, r2

Fragment 2:

    t0 = load r0
    t1 = sub t0, 10
    t2 = load r1
    t3 = add t2, 3
    t3 = mul t3, t1
    store t3, r2

Page 15: Similarity of Binaries through Re-optimization

Computing Similarity

• Reminder:

$\mathit{Similarity}(q, t) = |R(q) \cap R(t)|$

dtls1_buffer_message(), icc 15.0.3, -O3:

    push r12
    push rbx
    push rbp
    sub rsp, 10

unrelated(), icc 15.0.3, -O3:

    push r12
    push rbx
    push rbp
    sub rsp, 10

The two fragments are equal (=).

Page 16: Similarity of Binaries through Re-optimization

Statistical Significance

• Given a canonical fragment s, we need to determine its significance.

$\Pr_W(s) = \dfrac{|\{\, p \in W \mid s \in R(p) \,\}|}{|W|}$, where $W$ is the set of all procedures in existence.

• We estimate W with a bounded, random sample of procedures P, a “Global Context”.
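As a rough illustration of the estimate over a global context P, the following Python sketch computes the fraction of sampled procedures whose representation contains a fragment. The sample, the fragment names and the function name `estimate_probability` are all hypothetical.

```python
from fractions import Fraction


def estimate_probability(fragment, global_context):
    """Fraction of sampled procedures p in P with fragment in R(p)."""
    containing = sum(1 for fragments in global_context if fragment in fragments)
    return Fraction(containing, len(global_context))


# Hypothetical global context P: one set of canonical fragments per procedure.
P = [
    {"prologue", "f1"},
    {"prologue", "f2"},
    {"prologue", "f3", "distinctive"},
    {"prologue"},
    {"f4"},
]

print(estimate_probability("prologue", P))     # common fragment
print(estimate_probability("distinctive", P))  # rare fragment
```

A fragment such as a shared function prologue appears in most sampled procedures and so receives a high probability, while a fragment seen in only one procedure receives a low one.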

Page 17: Similarity of Binaries through Re-optimization

Computing Similarity

$\mathit{Similarity}(q, t) = \sum_{s \in R(q) \cap R(t)} \dfrac{1}{\Pr_P(s)}$

A sum, ranging over the shared canonical fragments, of the inverse probability of each fragment.

• A common fragment with $\Pr_P(s) = \frac{1}{5}$ will contribute 5
• A distinctive fragment with $\Pr_P(s) = 0.001$ will contribute 1000
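The weighted sum can be written directly in Python. This is a sketch, not GitZ's code: procedures are modeled as sets of fragment names, the probability table is hypothetical, and every shared fragment is assumed to have a non-zero estimate (the slide does not say how an unseen fragment would be handled).

```python
from fractions import Fraction


def similarity(query, target, probability):
    """Sum of 1 / Pr_P(s) over the fragments shared by query and target."""
    shared = query & target
    return sum(1 / probability[s] for s in shared)


# Hypothetical estimates, using the two values from the slide.
probability = {
    "common": Fraction(1, 5),
    "distinctive": Fraction(1, 1000),
}

query = {"common", "distinctive"}
same_source = {"common", "distinctive"}
unrelated = {"common"}

print(similarity(query, same_source, probability))  # 5 + 1000
print(similarity(query, unrelated, probability))    # 5
```

The unrelated procedure still shares the common fragment, but its score stays two orders of magnitude below that of the procedure sharing the distinctive fragment.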

Page 18: Similarity of Binaries through Re-optimization

Evaluation

Prototype Implementation: GitZ

Page 19: Similarity of Binaries through Re-optimization

GitZ Output

Security Researcher

Procedure q: OpenSSL's dtls1_buffer_message()

1. Procedure t_4?        Similarity: 170.34
2. Procedure t_?3        Similarity: 168.91
3. Procedure t_900       Similarity: 130.41
4. Procedure t_??8,777   Similarity: 101.11
5. Procedure t_43,08?    Similarity: 13.19
…

Page 20: Similarity of Binaries through Re-optimization

Evaluation Corpus

• Real-world code packages
  • OpenSSL, …, QEMU, wget, ffmpeg, Coreutils
  • Containing ~11,000 procedures
• Compiled to (× 9 compilers):
  • x86_64 with Clang 3.{4,5}, gcc 4.{6,8,9} and icc {14,15}
  • ARM-64 with aarch64-gcc 4.8 and aarch64-Clang 4
• Optimization levels -O{0,1,2,3,s} (× 5)
• Corpus size: |T| = 45 × ~11,000 ≈ 500,000, where 45 = 9 compilers × 5 optimization levels
• Queries: 9 procedures from notable CVEs
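The corpus-size arithmetic can be checked by enumerating the build configurations. The sketch below takes the compiler and optimization lists from this slide; the figure of 11,000 procedures per build is the slide's approximation, and the variable names are illustrative only.

```python
from itertools import product

x86_compilers = [
    "Clang 3.4", "Clang 3.5",
    "gcc 4.6", "gcc 4.8", "gcc 4.9",
    "icc 14", "icc 15",
]
arm_compilers = ["aarch64-gcc 4.8", "aarch64-Clang 4"]
optimization_levels = ["-O0", "-O1", "-O2", "-O3", "-Os"]

# Every compiler is paired with every optimization level.
configurations = list(product(x86_compilers + arm_compilers, optimization_levels))

PROCEDURES_PER_BUILD = 11_000  # approximate, per the slide
corpus_size = len(configurations) * PROCEDURES_PER_BUILD

print(len(configurations), corpus_size)
```

Nine compilers times five levels gives 45 configurations, and 45 × 11,000 = 495,000, which the slide rounds to ~500,000.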

Page 21: Similarity of Binaries through Re-optimization

GitZ Accuracy

1. Procedure t_4?        Similarity: 170.34
2. Procedure t_?3        Similarity: 168.91
3. Procedure t_900       Similarity: 130.41
4. Procedure t_??8,777   Similarity: 101.11
5. Procedure t_43,08?    Similarity: 13.19
…
500,000. Procedure t_8?  Similarity: 0.0

• Here, we report accuracy as the number of FPs ranked above the lowest TP

[Figure: each ranked result is labeled Positive or Negative]

#FP = 1

$\mathit{FPr} = \dfrac{1}{500{,}000}$
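The accuracy measure on this slide, the number of false positives ranked above the lowest true positive, can be sketched as follows. The ranked label list is hypothetical (the transcript does not preserve which rank carries which label), and the rate is taken as #FP divided by the corpus size, which matches 1/500,000 here.

```python
def false_positives_above_lowest_tp(ranked_labels):
    """ranked_labels: best-ranked first; True marks a true positive."""
    if True not in ranked_labels:
        return 0
    # Index of the lowest-ranked true positive.
    lowest_tp = len(ranked_labels) - 1 - ranked_labels[::-1].index(True)
    return sum(1 for is_tp in ranked_labels[:lowest_tp] if not is_tp)


def false_positive_rate(num_fp, corpus_size):
    return num_fp / corpus_size


# Hypothetical ranking: one negative sits above the last positive.
ranking = [True, True, False, True, False, False]
num_fp = false_positives_above_lowest_tp(ranking)

print(num_fp)
print(false_positive_rate(num_fp, 500_000))
```

Negatives ranked below the last true positive do not count, so a ranking that places all positives first scores zero regardless of corpus size.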

Page 22: Similarity of Binaries through Re-optimization

GitZ Usefulness

• How useful is GitZ in the vulnerability search scenario?

#  Alias          CVE        #FPs  FP Rate   Time
1  Heartbleed     2014-0160  52    0.000104  15m
2  Shellshock     2014-6271  0     0         17m
3  Venom          2015-3456  0     0         16m
4  Clobberin'     2014-9295  0     0         16m
5  Shellshock #2  2014-7169  0     0         12m
6  WS-snmp        2011-0444  0     0         14m
7  wget           2014-4877  0     0         10m
8  ffmpeg         2015-6826  0     0         17m
9  WS-statx       2014-8710  0     0         18m

GitZ Scalability

• How well does GitZ scale?
  • Times measured over the 500K corpus, on 72 cores
  • 0.12s for a query-target pair, on a single core
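As a back-of-the-envelope check, the per-pair time and the per-query times can be related with a short calculation. It assumes ideal parallel speedup across the 72 cores and that a query is compared against every procedure in the corpus; neither assumption is stated on the slide.

```python
CORPUS_SIZE = 500_000      # procedures in the corpus
SECONDS_PER_PAIR = 0.12    # single-core time for one query-target pair
CORES = 72


def estimated_query_minutes(corpus_size, seconds_per_pair, cores):
    """Wall-clock minutes for one query, assuming perfect parallelism."""
    return corpus_size * seconds_per_pair / cores / 60


minutes = estimated_query_minutes(CORPUS_SIZE, SECONDS_PER_PAIR, CORES)
print(f"{minutes:.1f} minutes")
```

The estimate of roughly 14 minutes falls inside the 10m to 18m range reported per query.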

Page 23: Similarity of Binaries through Re-optimization

Evaluating Solution Components

• How does each component of our solution affect the accuracy of GitZ?

Query        Corpus Size |T|   #Positives
Heartbleed   10000             45

Page 24: Similarity of Binaries through Re-optimization

Evaluating the Global Context

[Plot: #FP (logarithmic axis, 1 to 100) against the size of P in number of procedures (1 to 901)]

Reminder: $\Pr_P(s) = \dfrac{|\{\, p \in P \mid s \in R(p) \,\}|}{|P|}$

Page 25: Similarity of Binaries through Re-optimization

Evaluating Solution Vectors

[Bar chart; axis: False Positive Rate; chart data series "CROC"]

Configuration      CROC
Canonical w/ GC    0.978
CanonNormLLVM      0.927
CanonLLVM w/ GC    0.875
CanonLLVM          0.851
NormLLVM w/ GC     0.954
NormLLVM           0.908
NormVEX w/ GC      0.943
NormVEX            0.905

Page 26: Similarity of Binaries through Re-optimization

Evaluating Solution Vectors

[Bar chart; axis: False Positive Rate; highlighted: Lifted (Only) LLVM Fragments; chart data series "CROC"]

Configuration      Value
Canonical w/ GC    0.022
CanonNormLLVM      0.073
CanonLLVM w/ GC    0.125
CanonLLVM          0.149
NormLLVM w/ GC     0.046
NormLLVM           0.092
LLVM w/ GC         0.122
LLVM Fragments     0.153

Page 27: Similarity of Binaries through Re-optimization

Evaluating Solution Vectors

[Bar chart with the same configurations and values as on Page 26; axis: False Positive Rate; legend: No Global Context, With Global Context; highlighted: LLVM, LLVM w/ Global Context]

Page 28: Similarity of Binaries through Re-optimization

Evaluating Solution Vectors

[Bar chart with the same configurations and values as on Page 26; axis: False Positive Rate; legend: No Global Context, With Global Context; highlighted: LLVM, LLVM w/ GC, Normalized LLVM Fragments]

Page 29: Similarity of Binaries through Re-optimization

Evaluating Solution Vectors

[Bar chart with the same configurations and values as on Page 26; axis: False Positive Rate; legend: No Global Context, With Global Context; highlighted: LLVM, LLVM w/ GC, Normalized LLVM Fragments, Optimized LLVM, Canonical]

Page 30: Similarity of Binaries through Re-optimization

Take Aways

• A procedure can be identified using a set of statistically significant fragments
  • The statistical data can be collected over a relatively small set
• Applying an optimizer “out-of-context” is useful for transforming fragments to canonical form
  • A form that allows finding similarity

Page 31: Similarity of Binaries through Re-optimization

Questions

• Canonical Form: The Good
• Canonical Form: The Bad
• Previous Work
• BinDiff
• More Experiments!
• You're over-thinking this
• Out-of-Context?
• Limitations
• Evaluating Binary Classifiers
• All v. All
• Is Pr(s) a probability even?
• The Global Context

Page 32: Similarity of Binaries through Re-optimization

Canonical Form: The Good

    mov rax, -2
    sub rbx, rax

Canonical:

    t0 = load r0
    t1 = add 2, t0
    store t1, r0

    add r12, 2

Canonical:

    t0 = load r0
    t1 = add 2, t0
    store t1, r0

The two canonical forms are equal (=).

    add rax, 1
    add rbx, 2
    add rbx, rax

    add rbx, 2
    add rax, 1
    add rax, rbx

Canonical:

    t0 = load r0
    t1 = load r1
    t2 = add t0, t1
    t3 = add t2, 3
    store t3, r1

Canonical:

    t0 = load r0
    t1 = add 2, t0
    store t1, r0

=

The Bad:

Page 33: Similarity of Binaries through Re-optimization

Canonical Form: The Bad

    add rax, 1
    add rbx, 2
    add rbx, rax

Canonical:

    t0 = load r0
    t1 = load r1
    t2 = add t0, t1
    t3 = add t2, 3
    store t3, r1

    add rax, 1
    add rbx, 2
    add rax, rbx

Canonical:

    t0 = load r0
    t1 = load r1
    t2 = add t0, t1
    t3 = add t2, 3
    store t3, r0

The two canonical forms differ (≠): the result is stored to r1 in the first and to r0 in the second.

Page 34: Similarity of Binaries through Re-optimization

Previous Work


Page 35: Similarity of Binaries through Re-optimization

BinDiff

Page 36: Similarity of Binaries through Re-optimization

The Global Context

• Sampled over canonical fragments, from procedures of all architectures/compilers/optimizations
• A new architecture/compiler/optimization?
  • We are somewhat future-proof due to optimization
    • Even if a compiler decides to do things a bit differently, it should arrive at the same canonical form
  • Entirely new behaviors will require a (partial) resampling
    • For instance: using the stack in offsets of 13 O_0

Page 37: Similarity of Binaries through Re-optimization

All v. All


Page 38: Similarity of Binaries through Re-optimization

Limitations


Page 39: Similarity of Binaries through Re-optimization

Evaluating Binary Classifiers

• The Receiver Operating Characteristic (ROC) is a widespread method for evaluating a binary classifier, by plotting the true-positive rate against the false-positive rate across all the different thresholds

• The Area Under the Curve (AUC) is then computed, producing a value between 0 and 1. Our results were > 0.96.

• ROC means “how well did we cover all true positives, before we encounter false positives”

• We used Concentrated ROC (CROC), an adaptation for huge corpora, which further “punishes” highly ranked FPs
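For intuition, the plain ROC AUC (not the concentrated variant, which this sketch does not implement) equals the probability that a randomly chosen positive is scored above a randomly chosen negative. The Python sketch below computes it that way; the scores reuse the similarity values from the GitZ output slide, and the positive/negative labels attached to them are hypothetical.

```python
def roc_auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


scores = [170.34, 168.91, 130.41, 101.11, 13.19]
labels = [True, True, False, True, False]  # hypothetical ground truth

print(roc_auc(scores, labels))
```

A single negative ranked above one positive costs one of the six pairs here; on a corpus of 500,000 targets the same mistake barely moves the plain AUC, which is the motivation given for using a variant that penalizes highly ranked FPs more heavily.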

Page 40: Similarity of Binaries through Re-optimization

Jaccard Index?

• Major difference: does not take statistical significance into account, at all.

$J(A, B) = \dfrac{|A \cap B|}{|A \cup B|}$
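A small Python sketch shows the consequence of ignoring significance. The fragment names are hypothetical, and returning 1.0 for two empty sets is a convention chosen here, not something the slide specifies.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two sets."""
    union = a | b
    if not union:
        return 1.0  # convention chosen for two empty sets
    return len(a & b) / len(union)


query = {"prologue", "f1", "f2"}
shares_common_fragment = {"prologue", "g1", "g2"}   # shares only a prologue
shares_distinctive_fragment = {"f1", "h1", "h2"}    # shares a rare fragment

print(jaccard(query, shares_common_fragment))
print(jaccard(query, shares_distinctive_fragment))
```

Both targets receive the same score because each shares exactly one of five distinct fragments with the query, even though one shared fragment is a ubiquitous prologue and the other is distinctive.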

Page 41: Similarity of Binaries through Re-optimization

Is Pr(s) a probability even?

• $\Pr_W(s)$ is a probability over the sample space of $W$:

$\Pr_W(s) = \dfrac{|\{\, p \in W \mid s \in R(p) \,\}|}{|W|}$

  • $W$ is a “multiset” of all canonical fragments in existence
  • $|\{\, p \in W \mid s \in R(p) \,\}|$ counts the occurrences of $s$ in $W$
  • $\Pr(s_1) + \Pr(s_2) + \Pr(s_3) + \Pr(s_4) = \frac{3}{7} + \frac{2}{7} + \frac{1}{7} + \frac{1}{7} = 1$
• $\Pr_P(s)$ is an estimate of $\Pr_W(s)$, which improves as $P$ grows
  • As we evaluated

[Figure: the multiset W, containing procedures p_1, p_2 and p_3 and seven fragment occurrences of s_1 to s_4]

Page 42: Similarity of Binaries through Re-optimization

You're over-thinking this

• Why not just run the binary and get a version string?
  • Sometimes a lib is embedded
  • You can't always easily run it (different arch, dependencies, etc.)
  • Running can put you at unnecessary risk
  • Purely static
  • We've seen cases where the version string is not maintained correctly!

Page 43: Similarity of Binaries through Re-optimization

Out-of-Context?

• In context: The slice is surrounded with:
  • Other instructions from the block
  • Other blocks
  • Other procedures

  The optimizer here must account for the surroundings, and cannot easily “cut-through” unrelated operations

• Out-of-context: Just the slice. The optimizer can easily extract a concise canonical fragment that can be matched with semantically equivalent fragments from other procedures.

Page 44: Similarity of Binaries through Re-optimization

More Experiments!

Scenario                                 Queries               Targets               FP Rate
Cross-(Optimization ∨ Version), x86_64   gcc 4.{6,8,9} -O*     gcc 4.{6,8,9} -O*     0.001
Cross-Optimization, ARM-64               aarch64-gcc 4.8 -O*   aarch64-gcc 4.8 -O*   0
Cross-Compiler, x86_64                   Compilers_x86 -O1     Compilers_x86 -O1     0.002

Page 45: Similarity of Binaries through Re-optimization

GitZ Accuracy: Cross-Compiler/Optimization/Architecture

Scenario                                      Queries                               Targets                               FP Rate
Cross-Architecture, Low Optimization          (Compilers_x86 ∨ Compilers_ARM) -O1   (Compilers_x86 ∨ Compilers_ARM) -O1   0.006
Cross-Architecture, Standard Optimization     (Compilers_x86 ∨ Compilers_ARM) -O2   (Compilers_x86 ∨ Compilers_ARM) -O2   0.005
Cross-Architecture, Heavy Optimization        (Compilers_x86 ∨ Compilers_ARM) -O3   (Compilers_x86 ∨ Compilers_ARM) -O3   0.004