Software-Based Approaches to Software...
Transcript of Software-Based Approaches to Software...
Skidmore College
Software-Based Approaches to Software
Protection
Ginger Myles
University of Arizona
Department of Computer Science
and
IBM Almaden Research Center
February 9, 2006 – p. 1
Skidmore College
Introduction
In this talk we will discuss...
The three major threats against IP in software
Software-based solutions
February 9, 2006 – p. 2
Skidmore College
3 Major Threats
Malicious Reverse Engineering: the extracting of a piece of aprogram in order to reuse it in ones own.
Software Tampering: the illegal modification of a program tocircumvent licence checks, to obtain access to digital mediaprotected by the software, etc.
Software Piracy: the illegal reselling of legally obtained copiesof a program.
February 9, 2006 – p. 3
Skidmore College
Malicious Reverse Engineering
Buy onecopy
Reusemodule
Sell
N Y
X
P
O
M
QM
Alice and Bob are competing software developers.
Bob reverse engineers Alice’s program and includes parts of it in
his own code.
Easier with Java bytecode, .NET, . . .
⇒ Alice obfuscates her code.
February 9, 2006 – p. 4
Skidmore College
What is Code Obfuscation?
T1 T2 T3P P′
A semantics-preserving transformation which
makes the program harder to understand
preserves original functionality
Idea is to obscure readability and understandability to such a
degree that it is less costly for the attacker to simply recreate
the program or purchase a legal copy.
February 9, 2006 – p. 5
Skidmore College
What is Code Obfuscation?
T1 T2 T3P P′
A semantics-preserving transformation which
makes the program harder to understand
preserves original functionality
Idea is to obscure readability and understandability to such a
degree that it is less costly for the attacker to simply recreate
the program or purchase a legal copy.
February 9, 2006 – p. 5
Skidmore College
Code Obfuscation
T1 T2 T3P P′
We want to develop an algorithm such that
Maximize obscurity
Maximize resilience
Minimize cost
February 9, 2006 – p. 6
Skidmore College
Code Obfuscation
Layout obfuscation: alter the information that is unnecessaryto the execution of the application such as identifier namesand source code formatting.
Data obfuscation: alter the data structures used by theprogram.
Control-flow obfuscation: disguise the true control flow of theapplication.
February 9, 2006 – p. 7
Skidmore College
Layout Obfuscation
Name obfuscation: rename the identifiers in the program tomeaningless names.
� �
c l a s s C{
vo id f oo ( ) { . . . }
vo id bar ( i n t i ) { . . . }
vo id f o oba r ( i n t i ) { . . . }
S t r i n g t o S t r i n g ( ) { . . . }
}� �
⇒
� �
c l a s s C{
vo id a ( ) { . . . }
vo id b ( i n t i ) { . . . }
vo id c ( i n t i ) { . . . }
S t r i n g t o S t r i n g ( ) { . . . }
}� �
February 9, 2006 – p. 8
Skidmore College
Layout Obfuscation
Name obfuscation: rename the identifiers in the program tomeaningless names.
� �
c l a s s C{
vo id f oo ( ) { . . . }
vo id bar ( i n t i ) { . . . }
vo id f o oba r ( i n t i ) { . . . }
S t r i n g t o S t r i n g ( ) { . . . }
}� �
⇒
� �
c l a s s C{
vo id a ( ) { . . . }
vo id a ( i n t i ) { . . . }
vo id b ( i n t i ) { . . . }
S t r i n g t o S t r i n g ( ) { . . . }
}� �
February 9, 2006 – p. 9
Skidmore College
Data Obfuscation
Promote primitive types: change all primitives into instances of their
respective wrapper classes.
� �pub l i c c l a s s C {
s t a t i c i n t gcd ( i n t x , i n t y ) {
i n t t ;
wh i l e ( t rue ) {
boolean b = x % y == 0;
i f ( b ) r e t u r n y ;
t = x % y ;
x = y ;
y = t ;
}
}
pub l i c s t a t i c vo id main ( S t r i n g [ ] a){
System . out . p r i n t ( ”Answer : ” ) ;
System . out . p r i n t l n ( gcd ( 1 0 0 , 1 0 ) ) ;
}
}
� �
⇒
� �pub l i c c l a s s C {
s t a t i c I n t e g e r gcd ( Integer x , Integer y){
Integer t = n u l l ;
wh i l e ( t rue ){
Boolean b = new Boolean ( x.intValue() %
y.intValue() == 0);
i f ( b.booleanValue() )
r e t u r n y ;
t = new Integer ( x.intValue() %
y.intValue() ) ;
x = y ;
y = t ;
}
}
pub l i c s t a t i c vo id main ( S t r i n g [ ] a){
System . out . p r i n t l n ( ”Answer : ” ) ;
System . out . p r i n t l n ( gcd ( new Integer(100) ,
new Integer(10) ) ;
}
}
� �
February 9, 2006 – p. 10
Skidmore College
Data Obfuscation
Boolean splitter: every boolean variable is split into two variables and the
state of the original variable is reflected in the combined state of the two
variables.
� �pub l i c c l a s s C {
s t a t i c i n t gcd ( i n t x , i n t y ) {
i n t t ;
wh i l e ( t rue ) {
boolean b = x % y == 0;
i f ( b ) r e t u r n y ;
t = x % y ; x = y ; y = t ;
}
}
pub l i c s t a t i c vo id main ( S t r i n g [ ] a){
System . out . p r i n t ( ”Answer : ” ) ;
System . out . p r i n t l n ( gcd ( 1 0 0 , 1 0 ) ) ;
}
}
� �
⇒
� �pub l i c c l a s s C {
s t a t i c i n t gcd ( i n t x , i n t y ) {
i n t t8, t7 , t ;
f o r ( ; ; ) {
i f ( x%y==0) { t8=1;t7=0; }
e l s e { t8=0;t7=0; }
i f ( (t7ˆt8)!=0 )
r e t u r n y ;
e l s e {
t=x%y ; x=y ; y=t ;
}
}}
pub l i c s t a t i c vo id main ( S t r i n g [ ] a ) {
System . out . p r i n t ( ”Answer : ” ) ;
System . out . p r i n t l n ( gcd ( 1 0 0 , 1 0 ) ) ; }
}
� �
February 9, 2006 – p. 11
Skidmore College
Control-flow Obfuscation
Can be based on normal transformations an optimizingcompiler would perform
method inlining
loop unrolling
February 9, 2006 – p. 12
Skidmore College
Opaque Predicates
Opaque Predicate: A predicate P is opaque at a programpoint p, if at point p the outcome of P is known atembedding time. If P always evaluates to True we write P T
p ,
for False we write P Fp , and if P sometimes evaluates to
True and sometimes to False we write P ?p .
Inserted to make it difficult for an adversary to analyze thecontrol-flow of the application.
February 9, 2006 – p. 13
Skidmore College
Sample Opaque Predicates
Number theoretically true opaque predicates
∀x, y ∈
�
7y2 − 1 6= x2
∀x ∈
�
2|bx2
2 c
∀x ∈
�
2|x(x + 1)
∀x ∈
�
x2 ≥ 0
∀x ∈
�
3|x(x + 1)(x + 2)
∀x ∈
�
7 6 |x2 + 1
∀x ∈
�
81 6 |x2 + x + 7
∀x ∈�
19 6 |4x2 + 4
∀x ∈
�
4|x2(x + 1)(x + 1)
February 9, 2006 – p. 14
Skidmore College
Control-flow Obfuscation
Bogus branch: use an opaque predicate to insert a bogus if statement.
� �pub l i c c l a s s C {
s t a t i c i n t gcd ( i n t x , i n t y ) {
i n t t ;
wh i l e ( t rue ) {
boolean b = x % y == 0;
i f ( b ) r e t u r n y ;
t = x % y ; x = y ; y = t ;
}
}
pub l i c s t a t i c vo id main ( S t r i n g [ ] a){
System . out . p r i n t ( ”Answer : ” ) ;
System . out . p r i n t l n ( gcd ( 1 0 0 , 1 0 ) ) ;
}
}
� �
⇒
� �pub l i c c l a s s C {
s t a t i c i n t gcd ( i n t x , i n t y ) {
i n t t ;
i n t x = 2 ;
wh i l e ( t rue ) {
boolean b = x % y == 0;
i f ( b ) r e t u r n y ;
t = x % y ; x = y ; y = t ;
if(x*x < 0) return 0;
}
}
pub l i c s t a t i c vo id main ( S t r i n g [ ] a){
System . out . p r i n t ( ”Answer : ” ) ;
System . out . p r i n t l n ( gcd ( 1 0 0 , 1 0 ) ) ;
}
}
� �
February 9, 2006 – p. 15
Skidmore College
Code Obfuscation Example
Name obfuscation, boolean splitter, bogus branch, promote primitives
� �pub l i c c l a s s C {
s t a t i c i n t gcd ( i n t x , i n t y ) {
i n t t ;
wh i l e ( t rue ) {
boolean b = x % y == 0;
i f ( b ) r e t u r n y ;
t = x % y ; x = y ; y = t ;
}
}
pub l i c s t a t i c vo id main ( S t r i n g [ ] a){
System . out . p r i n t ( ”Answer : ” ) ;
System . out . p r i n t l n ( gcd ( 1 0 0 , 1 0 ) ) ;
}
}
� �
⇒
� �pub l i c c l a s s C {
s t a t i c I n t e g e r a ( I n t e g e r x , I n t e g e r y ) {
I n t e g e r t8 , t7 , t ;
I n t e g e r b = new I n t e g e r ( 2 ) ;
f o r ( ; ; ) {
i f ( x . i n tVa l u e ( ) % y . i n tVa l u e () == 0){
t8 = new I n t e g e r ( 1 ) ; t7 = new I n t e g e r ( 0 ) ;
}e l s e {
t8 = new I n t e g e r ( 0 ) ; t7 = new I n t e g e r ( 0 ) ;
i f ( ( t7 . i n tVa l u e ( )ˆ t8 . i n tVa l u e ( ) ) != 0){
r e t u r n y ;
}e l s e {
t = new I n t e g e r ( x . i n tVa l u e () %
y . i n tVa l u e ( ) ) ;
x = y ; y = t ;
}
i f ( b . i n tVa l u e ( ) ∗ b . i n tVa l u e () < 0)
r e t u r n 0 ;
}
}
pub l i c s t a t i c vo id main ( S t r i n g [ ] a){
System . out . p r i n t ( ”Answer : ” ) ;
System . out . p r i n t l n ( a (new I n t e g e r (100) ,
new I n t e g e r ( 1 0 ) ) ) ;
}
}
� �February 9, 2006 – p. 16
Skidmore College
Software Tampering
Receivetrialsoftware
Removeexpiration check
if(date > Feb 9, 2006) {
display("trial expired");
exit();
}
C C′
Alice is software developer.
Alice gives Bob a trial version of her software.
Bob locates the expiration check and disables it. He now hasunlimited use of the software.
⇒ Alice tamperproofs her program.
February 9, 2006 – p. 17
Skidmore College
What is Software Tamperproofing?
W W
Alice Bob
T
P P′ P′′P′′′
K
Code obfuscation is used to hide a secret.
Tamperproofing is used to protect the secret from alteration .
Tamperproofing performs two duties
1. Detect that the software has been altered.
2. Once detection has occurred, cause the program to fail orrepair itself.
February 9, 2006 – p. 18
Skidmore College
What is Software Tamperproofing?
W W
Alice Bob
T
P P′ P′′P′′′
K
Code obfuscation is used to hide a secret.
Tamperproofing is used to protect the secret from alteration .
Tamperproofing performs two duties
1. Detect that the software has been altered.
2. Once detection has occurred, cause the program to fail orrepair itself.
February 9, 2006 – p. 18
Skidmore College
Software Tamperproofing
W W
Alice Bob
T
P P′ P′′P′′′
K
We want to design an algorithm such that
software failure is stealthy
does not alert the attacker to the location of the failureinducing code.
February 9, 2006 – p. 19
Skidmore College
Software Tamperproofing
W W
Alice Bob
T
P P′ P′′P′′′
K
We want to design an algorithm such that
software failure is stealthy
does not alert the attacker to the location of the failureinducing code.
� �
i f ( tamper ing )
abo r t ( ) ;� �
February 9, 2006 – p. 20
Skidmore College
Software Tamperproofing
Tamper detection techniques
Inspect the code
Inspect the state
Generate code
February 9, 2006 – p. 21
Skidmore College
Software Tamperproofing
Tamper detection techniques
Inspect the code
Inspect the state
Generate code
No matter which technique, the expected result should not berevealed to the attacker.
� �
i f ( cu r r en tDa t e >= Feb 9 , 2 006 )
e x i t ( ) ;� �
February 9, 2006 – p. 22
Skidmore College
Software Tamperproofing — Aucsmith
Upper
Lower
Decrypt&JumpCode
EncryptedCode block #2
Decrypt&JumpCode
Code block #3Encrypted
Decrypt&JumpCode
EncryptedCode block #4
Code block #1Plaintext
Decrypt&JumpCode U
pper
Lower
Decrypt&JumpCode
EncryptedCode block #4
Decrypt&JumpCode
Code block #1Encrypted
Decrypt&JumpCode
EncryptedCode block #2
Decrypt&JumpCode
Code block #3Plaintext
⊗⊗
Tamper-proofObfuscate/
⊗
⊗
P
Key is the integrity verification kernels.
Each IVK contains 2n blocks of code.
Half are in upper memory, half in lower memory.
XOR each block in upper memory with a block in lower memory.
Result one block is decrypted and execution resumes at that block.
February 9, 2006 – p. 23
Skidmore College
Software Piracy
Resell
Make illegalcopies
Buy onecopy P
P
P
Alice is a software developer.
Bob buys one copy of Alice’s application and sells copies to third
parties.
⇒ Alice watermarks/fingerprints her program.
February 9, 2006 – p. 24
Skidmore College
What is Software Watermarking?
A technique used to aid in the prevention of software piracy.
Embed a unique identifier in a program.
Watermarking
Same identifier
Copyright notice
Discourages theft
Fingerprinting
Different identifier
Customer identification
Trace illegal copies
Discourages but does not prevent illegal copying and redistribution.
February 9, 2006 – p. 25
Skidmore College
Software Watermarking
A watermarking system consists of two functions:
embed(P, w, key) → P ′
recognize(P ′, key) → w
Watermarked Program
OriginalProgram
EmbedWatermark
ExtractWatermark
WW KK
P P′
February 9, 2006 – p. 26
Skidmore College
Software Watermarking
EmbedWatermark
AttackWatermark
OriginalProgram
Watermarked Program
ExtractWatermark
Attacked Program
Watermark
WW WP′
K
P P′
K
We want to develop an algorithm such that when we embedthe watermark W in the program P
W is resilient to various attacks.
W is stealthy.
W is large (high bit-rate).
The overhead (space and time) is low.
February 9, 2006 – p. 27
Skidmore College
Attacks on Software Watermarks
Subtractive Attack: The adversary examines the(disassembled/de-compiled) program in an attempt to discoverthe watermark and to remove all or part of it from the code.
W
Alice
KP′′
Bob
P′PK
February 9, 2006 – p. 28
Skidmore College
Attacks on Software Watermarks
Additive Attack: The adversary adds a new watermark inorder to make it hard for the IP owner to prove that herwatermark is actually the original.
W W1
W1
WW
AdditiveAttack
P′P
K
Alice Bob
K1
KP′′
February 9, 2006 – p. 29
Skidmore College
Attacks on Software Watermarks
Distortive Attack: A series of semantics-preservingtransformations are applied to the software in an attempt torender the watermark useless.
W W’ W’
DistortiveAttack
Bob
K
Alice
PK
P′ P′′
February 9, 2006 – p. 30
Skidmore College
Attacks on Software Watermarks
Collusive Attack: The adversary compares two copies of thesoftware which contain different fingerprints in order toidentify the location.
F1
F2
CollusiveAttackP1
Alice Bob
PPK1
K2
P2
February 9, 2006 – p. 31
Skidmore College
Naive Watermarking Techniques
Constant String
� �
S t r i n g watermark = ‘ ‘ Copy r i gh t 2 0 0 6 . . . ’ ’ ;� �
� �
S t r i n g watermark = ‘ ‘CC Number 1 2 3 4 . . . ’ ’ ;� �
Easy to attack, unstealthy, high bit-rate, little overhead.
February 9, 2006 – p. 32
Skidmore College
Naive Watermarking Techniques
Switch Encoding
� �
switch ( E ) {
case 1 : { · · ·}
case 5 : { · · ·}
case 9 : { · · ·}
}� �
⇒
� �
switch ( E ) {
case 5 : { · · ·}
case 1 : { · · ·}
case 9 : { · · ·}
}� �
Easy to attack, stealthy , low bit-rate , no overhead.
February 9, 2006 – p. 33
Skidmore College
Watermarking Transformations
Naive approaches:
Renaming
L: X:EmbedP P′
Reordering
Embed
P P′
February 9, 2006 – p. 34
Skidmore College
Watermarking Transformations
Naive approaches:
Renaming
L: X:EmbedP P′
Reordering
Embed
P P′
February 9, 2006 – p. 34
Skidmore College
Watermarking Transformations
Advanced approaches:
Alter program statistics
Embed
Extend program semantics
Embed
February 9, 2006 – p. 35
Skidmore College
Watermarking Transformations
Advanced approaches:
Alter program statistics
Embed
Extend program semantics
Embed
February 9, 2006 – p. 35
Skidmore College
Monden et al.
Semantics extending transformation.
Embeds the watermark in a dummy method that is added tothe application.
The embedding is accomplished through a speciallyconstructed sequence of instructions.
Since the inserted method is never executed there is flexibilityin how the instructions are constructed.
Can disguise the method by adding a call to the methodwhich is regulated by an opaque predicate.
COMPSAC 2000.
February 9, 2006 – p. 36
Skidmore College
Monden et al.
Encode 8 bits of the watermark by replacing the operand of every
BIPUSH instruction.
� �void whileI( ){
int i = 0;
while( i < 100 ){
i++;
}
}� �
⇒
� �0 i c o n s t 0 // push i n t con s t an t 0
1 i s t o r e 1 // s t o r e i n t o l o c a l v a r i a b l e 1
2 goto 8 // f i r s t t ime no inc r ement
5 i i n c 1 1 // add 1 to l o c a l v a r i a b l e 1
8 i l o a d 1 // l oad from l o c a l v a r i a b l e 1
9 b ipu sh 100 // push a sma l l i n t ( 100 )
11 i f i c m p l t 5 // compare , i f t r u e goto 5
14 r e t u rn // r e t u r n vo i d when done� �
February 9, 2006 – p. 37
Skidmore College
Monden et al.
Encode 3 bits of the watermark by replacing each arithmetic instruction.
iadd 000
iand 001
ior 010
ixor 011
irem 100
idiv 101
imul 110
isub 111
� �i n t f a c t ( i n t x){
i n t f a c t o r i a l = 1 ;
f o r ( i n t i =1; i <= x ; i++){
f a c t o r i a l ∗= i ;
}
r e t u r n f a c t o r i a l ;
}
� �
⇒
� �0 i c o n s t 1
1 i s t o r e 1
2 i c o n s t 1
3 i s t o r e 2
4 goto 14
7 i l o a d 1
8 i l o a d 2
9 imu l
10 i s t o r e 1
11 i i n c 2 1
14 i l o a d 2
15 i l o a d 0
16 i f i c m p l e 7
19 i l o a d 1
20 i r e t u r n
� �
February 9, 2006 – p. 38
Skidmore College
Arboit
Semantics extending transformation.
k branching points throughout the application are randomlyselected.
At each branching point either ∧P T , ∨¬P T , or ∨P F isappended to the predicate at that location.
The bits of the watermark are embedded through the opaquepredicate that has been chosen.
Within the opaque predicate the bits can be encoded eitheras constants or by assigning a rank to each of the opaquepredicates.
ICECR-5, 2002.
February 9, 2006 – p. 39
Skidmore College
Arboit Algorithm 1 Example
� �c l a s s C{
vo id m1( i n t a , i n t b){
. . .
i f ( a <= b ){ . . .}
e l s e { . . .}
. . .
}
}� �
W⇒
� �c l a s s C{
vo id m1( i n t a , i n t b){
. . .
i n t c=1;
i f ( ( a <= b) && ( c∗c >= 0)){ . . .}
e l s e { . . .}
. . .
}
}� �
February 9, 2006 – p. 40
Skidmore College
Qu and Potkonjak
5 2
4 3
1
5 2
4 3
1
5 2
4
1
3
Original Marked New coloring
⇒ ⇒
Renaming transformation.
Embed the mark by adding constraints (extra edges) to the register interference
graph.
Easy to attack by random register re-numbering.
3rd International Information Hiding Workshop, 1999.
February 9, 2006 – p. 41
Skidmore College
Interference Graph
Models the relationship between the variables in the procedure.
Each variable in the procedure is represented by a vertex.
If two variables have overlapping live ranges then the verticesare joined by an edge.
The graph is colored so that we can assign the variables toregisters so that we minimize the number of registers requiredand variables that are live at the same time do not share aregister.
February 9, 2006 – p. 42
Skidmore College
Interference Graph Example
� �
v1 := 2 * 2
v2 := 2 * 3
v3 := 2 * v2
v4 := v1 + v2
v5 := 3 * v3� �
1
2
3 4
5
February 9, 2006 – p. 43
Skidmore College
Interference Graph Example
� �
v1 := 2 * 2
v2 := 2 * 3
v3 := 2 * v2
v4 := v1 + v2
v5 := 3 * v3� �
1
2
3 4
5
February 9, 2006 – p. 44
Skidmore College
Interference Graph Example
� �
v1 := 2 * 2
v2 := 2 * 3
v3 := 2 * v2
v4 := v1 + v2
v5 := 3 * v3� � 4
5
1
2
3
February 9, 2006 – p. 45
Skidmore College
Qu and Potkonjak
(e) Watermarked Bytecode
50 : iconst_1
53 : iload_254 : iconst_155 : isub56 : istore_257 : goto −> 12
62 : ireturn
21 : iconst_1
42 : baload43 : if_icmpeq −> 5046 : iconst_047 : goto −> 51
34 : ifne −> 60
38 : iload_237 : aload_0
39 : baload40 : aload_141 : iload_2
22 : ifne −> 33
30 : goto −> 3433 : iconst_1
13 : iconst_014 : if_icmplt −> 2117 : iconst_018 : goto −> 22
METHOD: fast_memcmp:([B[BI)Z0 : iconst_01 : istore 33 : iconst_04 : istore_35 : iconst_1
8 : iload_29 : iconst_110 : isub11 : istore_212 : iload_2
6 : istore 4
25 : iload 4
60 : iload 4
51 : istore 4
27 : invokestatic
v1
v2
v3
v4v5
v6
v7
(b) Original Interference Graph
v1
v2
v3
v4v5
v6
v7
(c) Watermarked Interference Graph
WatermarkEmbed
v7 2v6 4v5 3v4 3v3 2v2 1v1 0
variable register number(d) Register Assignment Table
(a) Original Bytecode
50 : iconst_151 : istore 353 : iload_254 : iconst_155 : isub56 : istore_257 : goto −> 1260 : iload 362 : ireturn
21 : iconst_1
42 : baload43 : if_icmpeq −> 5046 : iconst_047 : goto −> 51
34 : ifne −> 60
38 : iload_237 : aload_0
39 : baload40 : aload_141 : iload_2
22 : ifne −> 3325 : iload 3
30 : goto −> 3433 : iconst_1
13 : iconst_014 : if_icmplt −> 2117 : iconst_018 : goto −> 22
METHOD: fast_memcmp:([B[BI)Z0 : iconst_01 : istore 33 : iconst_04 : istore_35 : iconst_16 : istore 38 : iload_29 : iconst_110 : isub11 : istore_212 : iload_2
27 : invokestatic
February 9, 2006 – p. 46
Skidmore College
What is Software Birthmarking?
birthmark
B
B
BAppliesObfuscation
Copies toSell
Sells
P
P′1
P′
P′2
A technique used to address the illegal distribution of all orsome part of a program.
Extract identifying characteristics from two programs to showthat one is a copy of the other.
February 9, 2006 – p. 47
Skidmore College
How Does Birthmarking Differ From Watermarking?
It is often necessary to alter existing code or add code to theapplication in order to embed the watermark.
Birthmarks cannot prove authorship or identify the source ofan illegal redistribution.
Birthmarks only confirm that one program is a copy ofanother.
February 9, 2006 – p. 48
Skidmore College
Software Birthmarking
A Birthmarking system consists of the following functions:
extract(p) → bp
extract(q) → bq
similarity(p, q) → [0, 1]
We want to develop an algorithm such that
B is resilient to various transformations.
B is credible.
February 9, 2006 – p. 49
Skidmore College
Software Birthmarking
A Birthmarking system consists of the following functions:
extract(p) → bp
extract(q) → bq
similarity(p, q) → [0, 1]
We want to develop an algorithm such that
B is resilient to various transformations.
B is credible.
February 9, 2006 – p. 49
Skidmore College
Software Birthmarking
Idea similar to that of:
Plagiarism detection
Code clones
What makes it unique is that a birthmark is computed at themachine code level and considers semantics-preservingtransformations.
February 9, 2006 – p. 50
Skidmore College
Myles and Collberg
Compute the set of unique opcode sequences of length k for aset of modules.
k = 3 minimized the probability of false positives whilemaximizing resistance to transformations.
ACM SAC 2005.
February 9, 2006 – p. 51
Skidmore College
Myles and Collberg
k-gram where k = 2
� �Method vo id main ( j a v a . l ang . S t r i n g [ ] )
0 g e t s t a t i c #13 < F i e l d j a v a . i o . P r i n tS t r eam out>
3 new #16 <C l a s s j a v a . l ang . S t r i n gBu f f e r>
6 dup
7 l d c #18 < S t r i n g ”15! = ”>
9 i n v o k e s p e c i a l #23 <Method j a v a . l ang . S t r i n gB u f f e r ( j a v a . l ang . S t r i n g)>
12 ldc2 w #24 <Long 15>
15 i n v o k e s t a t i c #29 <Method l ong f a c t ( l ong)>
18 i n v o k e v i r t u a l #33 <Method j a v a . l ang . S t r i n gB u f f e r append ( l ong)>
21 i n v o k e v i r t u a l #37 <Method j a v a . l ang . S t r i n g t o S t r i n g ()>
24 i n v o k e v i r t u a l #40 <Method vo id p r i n t l n ( j a v a . l ang . S t r i n g)>
27 r e t u r n
� �
� �{( g e t s t a t i c , new ) ,
(new , dup ) ,
( dup , l d c ) ,
( ldc , i n v o k e s p e c i a l ) ,
( i n v o k e s p e c i a l , l dc2 w ) ,
( ldc2 w , i n v o k e s t a t i c ) ,
( i n v o k e s t a t i c , i n v o k e v i r t u a l ) ,
( i n v o k e v i r t u a l , i n v o k e v i r t u a l ) ,
( i n v o k e v i r t u a l , r e t u r n )}
� �
February 9, 2006 – p. 52
Skidmore College
Myles and Collberg
Computing similarity:
similarity(bp, bq) = |bp∩bq |
|bp|
February 9, 2006 – p. 53
Skidmore College
Birthmarks and Watermarks
Birthmarks provide weaker evidence than software watermarks.
They can indicate that one program is likely to be a copyof another, but not who the original author is or who isguilty of piracy.
Birthmarks can be used in instances where watermarking isnot feasible
Birthmarks can be used in conjunction with watermarking toprovide stronger evidence of theft.
There are instances where watermarks are destroyed butbirthmarks are not.
February 9, 2006 – p. 54
Skidmore College
Summary
The IP in software is threatened in several different ways.
There are a variety of techniques to address the differentissues.
The goal behind software-based techniques is to require“enough” time, effort, and/or resources to break such that itis less costly for the attacker to simply rewrite the software orpurchase legal copies.
Currently no single mechanism exists to prevent all threethreats.
By combining the techniques, a stronger defense whichprovides multiple levels of protection can be achieved.
February 9, 2006 – p. 55
Skidmore College
Questions?
February 9, 2006 – p. 56
Skidmore College
Arboit Algorithm 2
k branching points throughout the application are randomlyselected.
At each branching point, MTbi
or MFbi
is created and a method
call is appended.
The bits of the watermark are encoded in the opaque methodthrough the opaque predicate that it evaluates.
February 9, 2006 – p. 57
Skidmore College
Arboit Algorithm 2 Example
� �
c l a s s C{
vo id m1( i n t a , i n t b ){
. . .
i f ( a <= b ) { . . . }
e l s e { . . . }
. . .
}
}� �
W⇒
� �
c l a s s C{
boolean m2(){
i n t c = 1 ;
r e t u r n ( c∗c >= 0);
}
vo id m1( i n t a , i n t b ){
. . .
i f ( ( a <= b) && m2 ( ) ) { . . . }
e l s e { . . . }
. . .
}
}� �
February 9, 2006 – p. 58
Skidmore College
Tamada et al.
Specific to Java class files.
Composed of 4 individual birthmarks
constant value in field variables
sequence of method calls
inheritance structure
used classes
IASTED International Conference on Software Engineering2004.
February 9, 2006 – p. 59
Skidmore College
Tamada et al.
Constant value in field values:� �
pub l i c c l a s s Sk inPane l extends JPane l
implements SandMarkGUIConstants
{
p r i v a t e Image image ;
p r i v a t e i n t imgWidth = −1;
p r i v a t e i n t imgHeight = −1;
p u b i l c Sk inPane l (){
s e tLayou t ( n u l l ) ;
se tBackground (new Co lo r (0 xe8d5bd ) ) ;
image = Too lK i t . g e tDe f a u l tToo lK i t ( ) . get Image (
g e tC l a s s ( ) . g e tC l a s s Load e r ( ) . g e tRe sou r c e (SAND IMAGE ) ) ;
i f ( image != n u l l ){
MediaTracker med = new MediaTracker ( t h i s ) ;
med . addImage ( image , 0 ) ;
t r y{
med . wa i t F o rA l l ( 10000 ) ;
}catch ( Excep t i on e){
throw new Runt imeExcept ion ( ex ) ;
}
wh i l e ( ( imgWidth = image . getWidth ( n u l l ) == −1 ||
( imgHeigth = image . g e tHe i gh t ( n u l l ) == −1);
}
}
}
� �
⇒� �
( j a v a . awt . Image , n u l l )
( i n t , −1)
( i n t , −1)
� �
February 9, 2006 – p. 60
Skidmore College
Tamada et al.
Sequence of method calls:
� �pub l i c c l a s s Sk inPane l extends JPane l implements SandMarkGUIConstants {
p r i v a t e Image image ; i n t imgWidth = −1; i n t imgHeight = −1;
p u b i l c Sk inPane l (){
s e tLayou t ( n u l l ) ; se tBackground (new Co lo r (0 xe8d5bd ) ) ;
image = Too lK i t . g e tDe f a u l tToo lK i t ( ) . get Image ( g e tC l a s s ( ) . g e tC l a s s Load e r ( ) . g e tRe sou r ce (SAND IMAGE ) ) ;
i f ( image != n u l l ){
MediaTracker med = new MediaTracker ( t h i s ) ; med . addImage ( image , 0 ) ;
t r y{
med . wa i t F o rA l l ( 10000 ) ;
}catch ( Excep t i on e){
throw new Runt imeExcept ion ( ex ) ;
}
wh i l e ( ( imgWidth = image . getWidth ( n u l l ) == −1 || ( imgHeigth = image . g e tHe i gh t ( n u l l ) == −1);
}
}
}
� �
� �j a v a x . swing . JPane l.< i n i t >() , j a v a . awt . Co lo r .< i n i t >( i n t ) , j a v a . awt . Too lK i t j a v a . awt . Too l k i t . g e tDe f au l tTookk i t ( ) ,
C l a s s Ojec t . g e tC l a s s ( ) , C l a s sLoade r C l a s s . g e tC l a s s Load e r ( ) , j a v a . net .URL C l a s sLoade r . g e tRe sou r c e ( S t r i n g ) ,
j a v a . awt . Image j a v a . awt . Too l k i t . get Image ( j a v a . net .URL) , j a v a . awt . MediaTracker.< i n i t >(j a v a . awt . Component ) ,
vo id j a v a . awt . MediaTracker . addImage ( j a v a . awt . Image , i n t ) , boolean j a v a . awt . MediaTracker . w a i t F o rA l l ( l ong ) ,
Runt imeExcept ion.< i n i t >(Throwable ) , i n t j a v a . awt . Image . getWidth ( j a v a . awt . image . ImageObserver ) ,
i n t j a v a . awt . Image . g e tHe i gh t ( j a v a . awt . image . ImageObserver ) ,
� �
February 9, 2006 – p. 61
Skidmore College
Tamada et al.
Inheritance structure:� �
pub l i c c l a s s Sk inPane l extends JPane l
implements SandMarkGUIConstants
{
p r i v a t e Image image ;
p r i v a t e i n t imgWidth = −1;
p r i v a t e i n t imgHeight = −1;
p u b i l c Sk inPane l (){
s e tLayou t ( n u l l ) ;
se tBackground (new Co lo r (0 xe8d5bd ) ) ;
image = Too lK i t . g e tDe f a u l tToo lK i t ( ) . get Image (
g e tC l a s s ( ) . g e tC l a s s Load e r ( ) . g e tRe sou r c e (SAND IMAGE ) ) ;
i f ( image != n u l l ){
MediaTracker med = new MediaTracker ( t h i s ) ;
med . addImage ( image , 0 ) ;
t r y{
med . wa i t F o rA l l ( 10000 ) ;
}catch ( Excep t i on e){
throw new Runt imeExcept ion ( ex ) ;
}
wh i l e ( ( imgWidth = image . getWidth ( n u l l ) == −1 ||
( imgHeigth = image . g e tHe i gh t ( n u l l ) == −1);
}
}
}
� �
⇒� �
j a v a x . swing . JPanel ,
j a v a x . swing . JComponent ,
j a v a . awt . Conta ine r ,
j a v a . awt . Component ,
j a v a . l ang . Object
� �
February 9, 2006 – p. 62
Skidmore College
Tamada et al.
Used classes:� �
pub l i c c l a s s Sk inPane l extends JPane l
implements SandMarkGUIConstants
{
p r i v a t e Image image ;
p r i v a t e i n t imgWidth = −1;
p r i v a t e i n t imgHeight = −1;
p u b i l c Sk inPane l (){
s e tLayou t ( n u l l ) ;
se tBackground (new Co lo r (0 xe8d5bd ) ) ;
image = Too lK i t . g e tDe f a u l tToo lK i t ( ) . get Image (
g e tC l a s s ( ) . g e tC l a s s Load e r ( ) . g e tRe sou r c e (SAND IMAGE ) ) ;
i f ( image != n u l l ){
MediaTracker med = new MediaTracker ( t h i s ) ;
med . addImage ( image , 0 ) ;
t r y{
med . wa i t F o rA l l ( 10000 ) ;
}catch ( Excep t i on e){
throw new Runt imeExcept ion ( ex ) ;
}
wh i l e ( ( imgWidth = image . getWidth ( n u l l ) == −1 ||
( imgHeigth = image . g e tHe i gh t ( n u l l ) == −1);
}
}
}
� �
⇒
� �j a v a . awt . Co lor ,
j a v a . awt . Component ,
j a v a . awt . Image ,
j a v a . awt . image . ImageObserver ,
j a v a . awt . MediaTracer ,
j a v a . awt . Too l k i t ,
j a v a . l ang . C la s s ,
j a v a . l ang . C la s sLoade r ,
j a v a . l ang . Oject ,
j a v a . l ang . Runt imeExcept ion ,
j a v a . l ang . S t r i ng ,
j a v a . l ang . Throwable ,
j a v a . net .URL ,
j a v a x . swing . JPane l
� �
February 9, 2006 – p. 63
Skidmore College
Tamada et al.
Computing similarity:
CVFV: similarity(bp, bq) = |bp∩bq |
|bp|
SMC, IS, UC : similarity(bq, bq) = |LCS(bp,bq)|
|bp|
February 9, 2006 – p. 64