A Graph Game Model for Software Tamper Protection

A Graph Game Model for Software Tamper Protection Information Hiding ‘07 June 11-13, 2007 Mariusz Jakubowski Ramarathnam Venkatesan Microsoft Research Nenad Dedić Boston University


Page 1: A Graph Game Model for Software Tamper Protection

A Graph Game Model for Software Tamper Protection

Information Hiding ’07, June 11-13, 2007

Mariusz Jakubowski, Ramarathnam Venkatesan
Microsoft Research

Nenad Dedić
Boston University

Page 2: A Graph Game Model for Software Tamper Protection

Overview

• Introduction
• Past work on software protection
• Definitions of tamper-resistance
• Anti-tampering transformations
• Security analysis
• Conclusion

Modeling of software tamper-resistance

Page 3: A Graph Game Model for Software Tamper Protection

Software Protection

• Obfuscation
  – Making programs “hard to understand”
• Tamper-resistance
  – Making programs “hard to modify”

• Obfuscation ⇒ tamper-resistance
• Tamper-resistance ⇒ obfuscation?

Page 4: A Graph Game Model for Software Tamper Protection

Formal Obfuscation

• Impossible in general
  – Black-box model (Barak et al.): “Source code” doesn’t help adversary who can examine input-output behavior.
  – Worst-case programs and poly-time attackers
• Possible in specific limited scenarios
  – Secret hiding by hashing (Lynn et al.)
  – Point functions (Wee, Kalai et al.)
• Results difficult to use in practice.

Page 5: A Graph Game Model for Software Tamper Protection

Tamper-resistance

• Many techniques used in practice, e.g.:
  – Code-integrity checksums
  – Anti-debugging and anti-disassembly methods
  – Virtual machines and interpreters
  – Polymorphic and metamorphic code
• Never-ending battle in a very active field
  – Targets: DRM, CD/DVD protection, games, dongles, licensing, etc.
  – Defenses: Binary packers and “cryptors,” special compilers, transformation tools, programming strategies, etc.
• Current techniques tend to be “ad hoc”:
  – No provable security
  – No analysis of time required to crack protected instances

Page 6: A Graph Game Model for Software Tamper Protection

Problem Definition

We would like an algorithm Protect with roughly the following properties:

For any program P, Protect(P) outputs a new program Q:
• Q uses almost the same resources as P.
• For any attacker A, if A(Q) outputs Q’, then either:
  o For any input x, Q’(x) = Q(x).
  o Q’ “crashes.”

Informally, a tamper-protected P either works exactly like P or fails.

Page 7: A Graph Game Model for Software Tamper Protection

Problems with the Definition

For any program P, Protect(P) outputs a new program Q:
• Q uses almost the same resources as P.
• For any attacker A, if A(Q) outputs Q’, then either:
  o For any input x, Q’(x) = Q(x).
  o Q’ “crashes.”

The definition is imprecise, but there is a bigger problem: it is unattainable.

Example “attack”: A(Q) = “run Q; append 0 to output”.

The “attack” is harmless, but it breaks the definition. No easy way out!
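The counterexample can be made concrete with a small sketch. The program `q` below is a hypothetical stand-in for a protected program Q (it is not from the talk); `attack` plays the role of A. The resulting Q’ never crashes and differs from Q on every input, so it violates the definition even though nothing of value was broken.

```python
def q(x: int) -> str:
    """Hypothetical stand-in for a protected program Q."""
    return str(x * x)


def attack(program):
    """A(Q): run Q, then append 0 to its output."""
    def q_prime(x: int) -> str:
        return program(x) + "0"
    return q_prime


q_prime = attack(q)

# Q' runs without crashing, yet its output differs from Q's on every input tried.
differs_everywhere = all(q_prime(x) != q(x) for x in range(5))
```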

Page 8: A Graph Game Model for Software Tamper Protection

Towards a Realistic Model

• Give up on complete protection of P.
  o Protect mainly some critical code portion L.
  o Protect other parts to deflect attention away from L.
• Model restricted (but realistic) attackers.
• Make engineering assumptions about security:
  o Code transformations
  o Tamper detection
  o Dataflow
  o Control flow

Page 9: A Graph Game Model for Software Tamper Protection

Known Techniques and Attacks

Main scenario:
Program P contains some security-critical code L.

For example:
• L verifies that P is licensed software.
• L verifies that P has a license for rendering content.
• L contains important data (e.g., keys and credentials).
• …

Next: Survey of known techniques and attacks to motivate the model and analysis.

Page 10: A Graph Game Model for Software Tamper Protection

Single-Point Check

[Figure: program P containing critical code L]

L is called from P:
    if (L returns 1)
        then proceed;
        else terminate;

Attack:
Control-flow analysis can help identify L.
Calls to L can then be patched.
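A minimal sketch of why a single-point check is fragile. The names `license_check`, `run_program` and the key string are hypothetical illustrations, not from the talk; replacing the one call site with a function that always returns 1 models the patch.

```python
def license_check(key: str) -> int:
    """Hypothetical critical code L: returns 1 only for a valid key."""
    return 1 if key == "VALID-KEY" else 0


def run_program(key: str, check=license_check) -> str:
    """Program P with a single call to L guarding all functionality."""
    if check(key) == 1:
        return "proceed"
    return "terminate"


def patched_check(key: str) -> int:
    """The attacker's replacement for L once its call site is located."""
    return 1
```

With the original check, `run_program("bogus")` terminates; with `check=patched_check` it proceeds, and only one location had to be found and changed.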

Page 11: A Graph Game Model for Software Tamper Protection

Distributed Check

L is broken up into pieces, and/or individualized copies are replicated.

L is called from P:
    if (L returns 1)
        then proceed;
        else terminate;

Attacks – based on flow graph:
1. L is typically weakly connected to rest of P.
2. Guess position of one copy of L. Use graph-diffing to find other copies (subgraph matching).

Page 12: A Graph Game Model for Software Tamper Protection

Code Checksums

To prevent tampering, compute checksums C1,…,Ck of code segments.

During execution, compare checksums of loaded code segments with pre-computed values.

[Figure: program P with checksums C1, C2, …, Ck over its code segments]

Attack:
Reading code segment can be trapped (some hardware or VM support may be needed).
Correct code segment can then be supplied by cracked program or VM.
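The checksum scheme can be sketched as follows. This is an illustration under assumptions: the byte strings are made-up placeholders for code segments, and SHA-256 is chosen only for convenience (the talk does not name a checksum function).

```python
import hashlib


def checksum(segment: bytes) -> str:
    """Checksum of one code segment (SHA-256 is an assumption)."""
    return hashlib.sha256(segment).hexdigest()


# Made-up byte strings standing in for code segments of P.
segments = [b"\x55\x89\xe5", b"\x8b\x45\x08", b"\xc9\xc3"]

# C1,…,Ck: pre-computed at protection time.
expected = [checksum(s) for s in segments]


def verify(loaded, expected):
    """At run time, compare each loaded segment against its stored checksum."""
    return [checksum(s) == e for s, e in zip(loaded, expected)]


# Simulated tampering: the second segment is overwritten.
tampered = list(segments)
tampered[1] = b"\x90\x90\x90"
```

Note that `verify` reads the code as data, which is exactly what the attack on the slide exploits: if reads of the code segment are trapped, the original bytes can be supplied to the checker while modified code runs.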

Page 13: A Graph Game Model for Software Tamper Protection

Oblivious Hashing

Main idea of OH:
• Compute hashes H1,…,Hk of execution traces.
• Update hashes with values of assigned variables and identifiers based on control flow.
• Correct hashes can be precomputed and used to encrypt some data.
• Individualized code replicas can be created; OH values from each replica should be equal.

[Figure: program P with hashes H1, H2, …, Hk]

Attacks:
Precomputed hash values could be discovered.
Code-replica scheme could be attacked using program analysis (addressed in this work).
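A rough sketch of the oblivious-hashing idea, assuming a simple running SHA-256 over (identifier, value) pairs. The `ObliviousHash` class and the instrumented `gcd_traced` function are hypothetical illustrations; real OH instruments compiled code rather than wrapping assignments by hand.

```python
import hashlib


class ObliviousHash:
    """Running hash over an execution trace (illustrative only)."""

    def __init__(self) -> None:
        self._h = hashlib.sha256()

    def update(self, tag: str, value: int) -> int:
        # Fold a control-flow identifier and an assigned value into the hash,
        # then hand the value back so the call can wrap an assignment.
        self._h.update(f"{tag}={value};".encode())
        return value

    def digest(self) -> str:
        return self._h.hexdigest()


def gcd_traced(a: int, b: int, oh: ObliviousHash) -> int:
    """Euclid's algorithm with every loop entry and assignment hashed."""
    while b:
        oh.update("loop", 1)
        a, b = oh.update("a", b), oh.update("b", a % b)
    return a


def trace_hash(a: int, b: int) -> str:
    oh = ObliviousHash()
    gcd_traced(a, b, oh)
    return oh.digest()
```

Two runs on the same input give the same hash, so the value can be precomputed; a run whose assignments or branches differ yields a different hash.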

Page 14: A Graph Game Model for Software Tamper Protection

Anti-disassembly

Disassembling can be made difficult by virtualization and individualization.

Idea is to convert P into instances I=(VI,PI).
VI: virtual machine. PI: implementation of P for VI.

Different instances I, J can have VI ≠ VJ.
So disassembling I is of little help in disassembling J.

[Figure: P converted into instances (V1,P1), (V2,P2), …, (Vn,Pn)]

Attack:
Vulnerable to attacks that do not use low-level details.
E.g., “copy attack”: To find out if branch B is causing a crash, save state before B and try multiple paths.

Page 15: A Graph Game Model for Software Tamper Protection

Defense Against Copy Attack

1. Crash only after multiple tampering changes detected.
2. Have many possible crash locations.
3. Delay the crash.
4. Randomize execution paths.

Somewhat achievable using known techniques, e.g.:
• Use redundant copies of encrypted data.
• Make many code fragments dependent on checks.
• Overlap code sections for context-dependent semantics.
• Create multiple individualized copies of code.

Page 16: A Graph Game Model for Software Tamper Protection

Defense Against Program Analysis

Basic notion: “local indistinguishability”
• Ideally, local observations of code/data/execution should give no useful information to attacker.
• In practice, try to satisfy this as much as possible.

1. Small code fragments all look alike.
   (E.g., use semantics-preserving peephole transformations.)
2. Control-flow graph looks like a complete graph.
   (E.g., use computed jumps and opaque predicates.)
3. Dataflow graph looks like a complete graph.
   (E.g., use lightweight encryption and temporary variable corruption.)
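As one illustration of the opaque predicates mentioned in item 2, the sketch below uses the fact that the product of two consecutive integers is always even. The function names are hypothetical, and this particular predicate is a textbook example rather than one taken from the talk.

```python
def opaque_true(x: int) -> bool:
    """Always True: x * (x + 1) is a product of consecutive integers, hence even."""
    return (x * (x + 1)) % 2 == 0


def guarded(x: int) -> int:
    # To a static analyzer both branches look reachable, which adds an
    # extra edge to the control-flow graph; only the first ever executes.
    if opaque_true(x):
        return x + 1
    return x - 1  # dead branch, present only to mislead analysis
```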

Page 17: A Graph Game Model for Software Tamper Protection

Detecting Unusual Data/Code

Security-related data/code can look unusual and rare (e.g., XOR used for encryption and random data used for crypto keys both stand out and can be detected).

To mitigate, can use peephole transformations, near-clear encryption, data scattering, etc.

Page 18: A Graph Game Model for Software Tamper Protection

An assortment of tools is available.

How to combine them effectively?

How much security can we get?

How to quantify security?

Page 19: A Graph Game Model for Software Tamper Protection

Basic Model

• Program:
  – A graph G
• Execution:
  – A “random” walk on G
• Integrity check:
  – Group of nodes in G responsible for monitoring a set of code fragments (probabilistically)
• Check failure:
  – Tampering flagged on all code fragments in a check’s set
• Tamper response:
  – An action taken when a “sufficient” number of checks have failed

Abstraction of software tamper-resistance

Page 20: A Graph Game Model for Software Tamper Protection

Elements of Model

Local tamper check: C = InsertCheck(F1,…,Fs)
• Check C of size s specified by s code fragments F1,…,Fs.
• Each Fi detects tampering with probability p.
• Check fails if each Fi detects tampering.
(Can have many checks C1,…,Ck.)

Tamper response: InsertResponse(P, (C1,…,Ck), f)
• P “crashes” if at least f checks fail (f is the threshold).
(“Crash” could be any other form of response: slowdown, graceful degradation, loss of features, self-correction, etc.)
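The check and response rules above can be simulated directly. This is a sketch under assumptions: fragments are represented by integer identifiers, `tampered` is the set of fragments the attacker has modified, and the function names are hypothetical.

```python
import random


def check_fails(fragments, tampered, p, rng) -> bool:
    """A check fails only if every one of its fragments detects tampering.

    A fragment detects tampering if it was tampered with and its detector
    fires, which happens with probability p.
    """
    return all(f in tampered and rng.random() < p for f in fragments)


def crashes(checks, tampered, p, f, rng) -> bool:
    """Tamper response: P 'crashes' once at least f checks have failed."""
    failed = sum(check_fails(c, tampered, p, rng) for c in checks)
    return failed >= f


rng = random.Random(0)
checks = [(0, 1), (1, 2), (3, 4)]   # three checks of size s = 2
tampered = {0, 1, 2}                # fragment 3 and 4 are untouched
```

With perfect detection (p = 1), exactly the first two checks fail here, so the program crashes for threshold f = 2 but not for f = 3.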

Page 21: A Graph Game Model for Software Tamper Protection

Elements of Model

Graph transformations: (V,E) = GraphTransform(P, n)
P is transformed into an equivalent program Q with flow graph G=(V,E) containing n nodes.
• G is random-looking (rapid mixing of random walks).
• Execution of Q induces a random-looking walk on G.

Critical-code embedding: F’ = CodeEntangle(F, L)
Critical code L is embedded into code fragment F, yielding F’.
• F’ is equivalent to “if L returns 1 then execute F”.
• Desirable to make embedded code hard to remove.

Page 22: A Graph Game Model for Software Tamper Protection

The Algorithm

Harden(P, L, l, n, k, s, f):
    let G = (V,E) = GraphTransform(P, n)
    for i = 1 to l do
        select at random v ∈ V
        v = CodeEntangle(v, L)
    for i = 1 to k do
        select at random (v1,…,vs) ∈ V^s
        Ci = InsertCheck(v1,…,vs)
    InsertResponse(G, (C1,…,Ck), f)

Main ideas:
1. Transform the flow graph into a “random” one.
2. Replicate critical code in l random nodes.
3. Randomly insert k checks of size s.
4. Create check response with threshold f.
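The structure of Harden can be sketched on an abstract graph. Everything here is a simplification made for illustration: GraphTransform is replaced by assigning random successors to each node (the `degree` parameter is an assumption), CodeEntangle by marking nodes as carrying L, and InsertCheck by recording random s-tuples. No real code transformation takes place.

```python
import random


def harden(n: int, l: int, k: int, s: int, f: int, degree: int = 3, seed: int = 0):
    """Build an abstract 'protected program': graph, critical nodes, checks, threshold."""
    rng = random.Random(seed)
    nodes = list(range(n))

    # Stand-in for GraphTransform: each node gets `degree` random successors.
    edges = {v: [rng.randrange(n) for _ in range(degree)] for v in nodes}

    # Stand-in for CodeEntangle: l random draws of a node to carry L.
    # Draws may repeat, as in the loop on the slide, so at most l nodes result.
    critical = {rng.randrange(n) for _ in range(l)}

    # Stand-in for InsertCheck: k random s-tuples of nodes.
    checks = [tuple(rng.randrange(n) for _ in range(s)) for _ in range(k)]

    # Stand-in for InsertResponse: just record the threshold f.
    return {"edges": edges, "critical": critical, "checks": checks, "threshold": f}


protected = harden(n=20, l=5, k=8, s=3, f=4)
```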

Page 23: A Graph Game Model for Software Tamper Protection

The Algorithm

• Programmer assistance can help in algorithm:
  o Choose places to embed critical code L.
  o Identify code/data suitable for checking.
  o Identify code/data suitable for tamper response.

Page 24: A Graph Game Model for Software Tamper Protection

Attack Model

Attacker plays a game on the program graph G.
Goal: Run the program and avoid executing critical code L.

Game moves:
• Make a step on G:
  o either follow untampered execution of P
  o or tamper to change execution (tampering detected by checks…)
• Guess a check D=(u1,…,us).
  o If D=Ci, then Ci is disabled.
• If P crashes, restart.

Page 34: A Graph Game Model for Software Tamper Protection

Attack Model

Attacker plays a game on flow-graph G=(V,E).

[Figure: flow graph G with critical-code nodes marked]

Check = set of nodes.
Execution = walk on G.

In each (random) step A can either:
- observe (models untampered execution)
- tamper current node

Check is activated when all its nodes are tampered.
P crashes when f checks are activated.
A tries to guess a check. If guess is correct, the check is disabled (can’t be activated).

Page 37: A Graph Game Model for Software Tamper Protection

Attack Model

Attacker plays a game on flow-graph G=(V,E).

[Figure: flow graph G with critical-code nodes marked]

Check = set of nodes.
Execution = walk on G.

Game moves: observe, tamper, guess.

Attacker wins if:
- P runs for >N steps without crashing.
- Each step in critical code is tampered.
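The game can be sketched as a small simulation for one naive attacker strategy. The assumptions are labeled in the code: each step visits a uniformly random node (an idealization of a rapidly mixing walk), detection is perfect (p = 1), and the attacker tampers with every critical node it visits and never uses the guess move. The function name `play` and its return format are hypothetical.

```python
import random


def play(n, checks, critical, f, N, seed=0):
    """Naive attacker: tamper every critical node encountered, never guess.

    Returns ("win", N) if P survives N steps, else ("crash", step).
    """
    rng = random.Random(seed)
    tampered = set()
    for step in range(1, N + 1):
        node = rng.randrange(n)      # idealized walk: uniform node each step
        if node in critical:
            tampered.add(node)       # 'tamper' move; otherwise just 'observe'
        # A check is activated when all of its nodes are tampered (p = 1).
        activated = sum(all(v in tampered for v in c) for c in checks)
        if activated >= f:
            return ("crash", step)
    return ("win", N)
```

With no checks installed the naive attacker always wins; with a check on the only node of a one-node graph, the first tampering step crashes the program.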

Page 38: A Graph Game Model for Software Tamper Protection

Security Estimates

Security analysis in graph model.

Parameters:
k = cn (# of checks proportional to # of nodes)
f = cn/2 (response threshold is half of the checks)
p = 1 (tamper detection is perfect)
l = n (critical code replicated in every node)
N = n^{1+ε} (required running time before crash)

Analyzed attacks take Ω(n^s) time! (s = check size)

No proof yet for arbitrary attacks. More work needed…

Page 39: A Graph Game Model for Software Tamper Protection

Security Arguments

P runs for >N steps
⇒ “Long” rapidly mixing random walk
⇒ Critical code encountered “many” times
⇒ A must tamper “many” nodes
⇒ Program crashes

Claim 1:
As long as no check is disabled, A wins with exponentially small probability.
(Enough to have “not too many” checks disabled.)

Page 40: A Graph Game Model for Software Tamper Protection

Security Arguments

Claim 1:
As long as no check is disabled, A wins with exponentially small probability.
(Enough to have “not too many” checks disabled.)

Desired claim 2:
Any O(n^{cs}) attacker learns a check location with exponentially small probability.

So far we have only analyzed some specific attacks.
No complete proof of Claim 2 yet.

Claim 1 + Claim 2 ⇒ No O(n^{cs}) attacker can win.

Page 41: A Graph Game Model for Software Tamper Protection

Attack 1: Voting Attack

Let V={1,…,n}.
Each check is an s-tuple of integers (v1,…,vs).

Main idea:
• Suppose A tampers with P, which subsequently crashes.
• Let W ⊆ V denote the tampered nodes.
• Then any s-tuple (v1,…,vs) ∈ W^s is more likely to be a check than not.

So “vote” for all (v1,…,vs) ∈ W^s.
Do this D times and output k candidates with most votes.

Page 42: A Graph Game Model for Software Tamper Protection

Attack 1: Voting Attack

Let V={1,…,n}.
Each check is an s-tuple of integers (v1,…,vs).

1. Fill an s-dimensional n×n×…×n array B with zeros.
2. for i=1 to D do
   1. run P and tamper with it arbitrarily until it crashes (let W be the set of tampered nodes)
   2. for each (v1,…,vs) ∈ W^s do B[v1,…,vs] = B[v1,…,vs] + 1
3. Find the k entries of B with highest values and output their indices as guesses for check nodes.

Can prove: Updating the table of votes takes n^s steps.
(Hence n^s is a lower bound on attack time.)
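The voting procedure can be sketched on a toy instance. Assumptions for illustration: nodes are numbered from 0, detection is perfect (p = 1), the attacker tampers with nodes in a random order until the crash, and a `Counter` replaces the dense s-dimensional array B. The helper `run_until_crash` is hypothetical and stands in for actually running P.

```python
import itertools
import random
from collections import Counter


def run_until_crash(n, checks, f, rng):
    """Tamper nodes in random order until f checks are activated; return W."""
    order = list(range(n))
    rng.shuffle(order)
    tampered = set()
    for v in order:
        tampered.add(v)
        if sum(all(u in tampered for u in c) for c in checks) >= f:
            break
    return tampered


def voting_attack(n, checks, f, s, k, D, seed=0):
    """Vote for every s-tuple over W in each of D crashing runs."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(D):
        W = run_until_crash(n, checks, f, rng)
        # This loop touches |W|^s tuples, the dominant cost of the attack.
        votes.update(itertools.product(sorted(W), repeat=s))
    guesses = [t for t, _ in votes.most_common(k)]
    return guesses, votes


guesses, votes = voting_attack(n=8, checks=[(2, 5)], f=1, s=2, k=4, D=20)
```

In this toy instance the single check (2, 5) is activated in every crashing run, so it collects all D votes. Other tuples over the same nodes, such as (5, 2), tie with it, which hints at why the vote table alone does not pinpoint a check cheaply.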

Page 43: A Graph Game Model for Software Tamper Protection

Attack 2: Intersection Attack

Let V={1,…,n}.
Each check is an s-tuple of integers (v1,…,vs).

Main idea:
Suppose A considers m tampered runs of P, with W1,…,Wm denoting sets of tampered nodes in each run.

If some check C=(v1,…,vs) is activated in all m runs, then C ∈ B = (W1 ∩ W2 ∩ … ∩ Wm)^s.

For large enough m, B could be of tractable size, and A could search all of it.
But a small B is unlikely to contain any checks.

Can prove: Expected time to find a check is still >n^{cs}.
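The candidate set B can be computed in a few lines. This sketch assumes the tampered sets W1,…,Wm are already available as Python sets; the function name and the example sets are hypothetical.

```python
import itertools


def intersection_candidates(runs, s):
    """Return B = (W1 ∩ W2 ∩ … ∩ Wm)^s as a set of s-tuples."""
    common = set.intersection(*(set(w) for w in runs))
    return set(itertools.product(sorted(common), repeat=s))


# Three made-up tampered sets that all activated the same check (2, 5).
runs = [{1, 2, 5, 7}, {2, 5, 6}, {0, 2, 5}]
B = intersection_candidates(runs, s=2)
```

Here the intersection is {2, 5}, so B holds four candidate tuples and contains (2, 5). The size of B is |W1 ∩ … ∩ Wm|^s, which is the quantity the attacker must keep tractable.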

Page 44: A Graph Game Model for Software Tamper Protection

Summary and Further Work

• Main goals of work
  o Modeling of software tamper-resistance
  o Algorithms for tamper-resistance with analyzable security
• Extensions
  o More realistic model:
    • Allow some adversarial steps in walk.
  o More realistic parameters:
    • p<1 – tamper detection unreliable
    • l<n – critical code replicated only in fraction of P
    • Other parameters: number of checks, threshold, etc.
  o Implementation