Deobfuscation of Virtualization-Obfuscated Software

Kevin Coogan, Gen Lu, Saumya Debray
Department of Computer Science, University of Arizona


Page 1: Deobfuscation  of Virtualization-Obfuscated Software

ADLab

Kevin Coogan, Gen Lu, Saumya Debray
Department of Computer Science
University of Arizona

Presenter: 張逸文

Deobfuscation of Virtualization-Obfuscated Software

Page 2: Deobfuscation  of Virtualization-Obfuscated Software

Outline

Introduction
Deobfuscation
Experimental Evaluation
Related Work
Conclusion

Page 3: Deobfuscation  of Virtualization-Obfuscated Software

Introduction (1/4)

Basics of reverse engineering
  Compilation (source code to executable)
  Decompilation (executable back to source-level code)

Page 4: Deobfuscation  of Virtualization-Obfuscated Software

Introduction (2/4)

Virtualization obfuscators
  VMProtect, Code Virtualizer

{
    VIRTUALIZER_START
    your code
    VIRTUALIZER_END
}

Page 5: Deobfuscation  of Virtualization-Obfuscated Software

Introduction (3/4)

Virtualization-obfuscated programs are resistant to static and dynamic analysis techniques
  The executed code reveals only the structure and logic of the byte-code interpreter
  Randomization of the VM

Outside-in approach
  Reverse engineer the VM interpreter
  Work out the individual byte-code instructions
  Recover the logic of the byte-code program
  Requires that the structure of the interpreter meets certain requirements

Page 6: Deobfuscation  of Virtualization-Obfuscated Software

Introduction (4/4)

Programs interact with the system through system calls
Identifying instructions that interact with the system
Not recovering the original instructions
Capturing the behavior of the code
General: usable in a wide range of cases

Page 7: Deobfuscation  of Virtualization-Obfuscated Software

Deobfuscation

Static analysis vs. dynamic trace
Identifying instructions that are known to be part of the original code
No information about the specific structure of the interpreter is needed

Page 8: Deobfuscation  of Virtualization-Obfuscated Software

Deobfuscation

Overall approach:
1. Tracing tool: produces a low-level execution trace
2. Identifying system calls and their arguments (database)
3. Instruction trace: identifying the relevant instructions
4. Building a subtrace: the relevant subtrace

Page 9: Deobfuscation  of Virtualization-Obfuscated Software

Deobfuscation

Value-based Dependence Analysis
  Not recovering the original code
  The process of deobfuscation must be semantics-preserving
  Identifying instructions that affect the values of the arguments to system calls
  Slicing algorithms: control-dependent
  Data dependencies
  Use-definition chains: link each instruction that uses a variable to the instruction that defines it
  Problem:

Page 10: Deobfuscation  of Virtualization-Obfuscated Software

Deobfuscation

Value-based dependence

if (I defines a location l ∈ S) {
    I is marked as relevant;
    l is removed from S;
    the set of locations used by I is added to S;
}

Problem: a pointer to a structure

I uses some locations l1, l2, …, ld
if (I uses li ∈ P to define ld) ld is added to P
if (li accesses a memory location) [li] is added to M
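The marking rule above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the `Instr` record, the register-name locations, and the sample trace are hypothetical, and the sketch assumes the trace is scanned backwards from a system call with S initialized to that call's argument locations. The pointer-handling sets P and M are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """One trace entry (hypothetical format): text plus defined and used locations."""
    text: str
    defs: frozenset
    uses: frozenset


def relevant_instructions(trace, syscall_args):
    """Scan the trace backwards and keep instructions that define a location in S."""
    wanted = set(syscall_args)        # the set S from the slide
    relevant = []
    for instr in reversed(trace):
        hit = instr.defs & wanted
        if hit:
            relevant.append(instr)    # I is marked as relevant
            wanted -= hit             # l is removed from S
            wanted |= instr.uses      # locations used by I are added to S
    relevant.reverse()                # restore execution order
    return relevant


# Hypothetical trace: the ecx instructions stand in for interpreter bookkeeping.
trace = [
    Instr("mov eax, 5", frozenset({"eax"}), frozenset()),
    Instr("mov ecx, 7", frozenset({"ecx"}), frozenset()),
    Instr("add eax, 3", frozenset({"eax"}), frozenset({"eax"})),
    Instr("mov ebx, eax", frozenset({"ebx"}), frozenset({"eax"})),
    Instr("inc ecx", frozenset({"ecx"}), frozenset({"ecx"})),
]

# Assume the system call takes its single argument in ebx.
result = [i.text for i in relevant_instructions(trace, {"ebx"})]
print(result)
```

In this toy trace the two `ecx` instructions are dropped because nothing they define ever flows into `ebx`.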

Page 11: Deobfuscation  of Virtualization-Obfuscated Software

Deobfuscation

Relevant Conditional Control Flow
  Value-based dependence analysis does not identify the associated control-flow instructions
  How conditional control flow occurs
  IA-32 architecture: setting the condition code flags in the eflags register
  Not that simple!
  Examining the target address
  Equational Reasoning System: translate each instruction in the dynamic trace into an equivalent set of equations

Page 12: Deobfuscation  of Virtualization-Obfuscated Software

Deobfuscation

Equational Reasoning System
  Identifies conditional dependencies
  The left-hand-side variable of an equation is numbered by the position at which its instruction appears in the trace
  Each right-hand-side variable is numbered by the instruction that defined it
  Example 1.
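A minimal Python sketch of this numbering rule follows. It is not the slide's own example: the trace format (destination, expression template, source list) and the function name `to_equations` are assumptions made for illustration, and the suffix 0 for values that were live before the trace started is likewise an assumed convention.

```python
def to_equations(trace):
    """Turn (dest, template, sources) entries into numbered equations.

    The template uses {name} placeholders for the source locations.
    """
    last_def = {}      # location -> number of the instruction that last defined it
    equations = []
    for n, (dest, template, sources) in enumerate(trace, start=1):
        # Right-hand side: each source carries the number of its defining instruction.
        numbered = {s: f"{s}{last_def.get(s, 0)}" for s in sources}
        rhs = template.format(**numbered)
        # Left-hand side: numbered by the position of this instruction in the trace.
        equations.append(f"{dest}{n} = {rhs}")
        last_def[dest] = n
    return equations


# Hypothetical three-instruction trace.
trace = [
    ("eax", "0x5", []),
    ("ebx", "{eax} + 0x3", ["eax"]),
    ("eax", "{eax} * {ebx}", ["eax", "ebx"]),
]
eqs = to_equations(trace)
for line in eqs:
    print(line)
```

Because every definition gets a fresh number, the two writes to `eax` become distinct variables (`eax1` and `eax3`), which is what lets later equations be substituted into one another without ambiguity.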

Page 13: Deobfuscation  of Virtualization-Obfuscated Software

Deobfuscation

Example 2.
Example 3. Indirect jump

Page 14: Deobfuscation  of Virtualization-Obfuscated Software


Deobfuscation

Example 4. Used in VMProtect

Target20 = index1*4+0x10000
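Taking the equation at face value, a short Python calculation shows why such a jump acts like conditional control flow: the target changes with the index. The constant names and the sample index values are assumptions for illustration; only the scale factor 4 and the base 0x10000 come from the slide.

```python
TABLE_BASE = 0x10000   # constant from the slide's equation
ENTRY_SIZE = 4         # scale factor from the slide's equation


def target(index):
    """Evaluate Target = index*4 + 0x10000 for a concrete index."""
    return index * ENTRY_SIZE + TABLE_BASE


# Hypothetical index values; each one selects a different target.
targets = {i: hex(target(i)) for i in range(3)}
print(targets)
```

Whatever earlier instructions computed the index therefore decide where control goes next, so they are candidates for the relevant subtrace.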

Page 15: Deobfuscation  of Virtualization-Obfuscated Software


Deobfuscation

Page 16: Deobfuscation  of Virtualization-Obfuscated Software


Deobfuscation


Page 17: Deobfuscation  of Virtualization-Obfuscated Software

Deobfuscation

Relevant Call-Return Control Flow
  Identifying functions: the behavior of calls and returns
  Knowing how they work allows one to use them for other purposes
  Behavior of Function Calls and Returns

Page 18: Deobfuscation  of Virtualization-Obfuscated Software

Deobfuscation

registers

call replaced with push

Cannot be resolved

Page 19: Deobfuscation  of Virtualization-Obfuscated Software

Deobfuscation

Identification Approach
  Call: a code address is saved at the call site
  Return: the saved address is used for a control transfer at the return point
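The two rules can be sketched in Python as a single pass over a trace. This is an illustration under stated assumptions, not the authors' tool: the dictionary-based trace format, the keys `store`, `jump_via`, and `target`, the code address range, and the function name are all hypothetical.

```python
def find_call_return_pairs(trace, code_range):
    """Pair instructions that save a code address with the later transfer that uses it.

    Each trace step is a dict with optional keys (hypothetical format):
      'store':    (location, value) written by the instruction
      'jump_via': location the control-transfer target was read from
      'target':   the address control was transferred to
    """
    lo, hi = code_range
    saved = {}     # location -> (index of the saving instruction, saved code address)
    pairs = []
    for i, step in enumerate(trace):
        loc = step.get("jump_via")
        # Return: the saved address is used for a control transfer.
        if loc in saved and saved[loc][1] == step["target"]:
            pairs.append((saved.pop(loc)[0], i))
        if "store" in step:
            dest, value = step["store"]
            if lo <= value < hi:
                saved[dest] = (i, value)   # Call: a code address is saved
            else:
                saved.pop(dest, None)      # slot overwritten with non-code data
    return pairs


# Hypothetical trace: the call is emulated by a push followed by a jmp.
trace = [
    {"text": "push 0x401005", "store": ("stack:0x12ff00", 0x401005)},
    {"text": "jmp 0x402000"},
    {"text": "mov [esp-4], 42", "store": ("stack:0x12fefc", 42)},
    {"text": "ret", "jump_via": "stack:0x12ff00", "target": 0x401005},
]
pairs = find_call_return_pairs(trace, (0x401000, 0x403000))
print(pairs)
```

The sketch matches on behavior rather than opcode, so the `push`/`jmp` sequence is still paired with the `ret` even though no `call` instruction appears in the trace.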

Page 20: Deobfuscation  of Virtualization-Obfuscated Software


Deobfuscation

Relevant Dynamic Trace

Page 21: Deobfuscation  of Virtualization-Obfuscated Software

Experimental Evaluation

Experimental Methodology
  Compile the original source code
  Generate an original dynamic trace
  Build an original subtrace
  Apply a virtualization-obfuscation technique
  Generate an obfuscated dynamic trace
  Build a relevant subtrace of the obfuscated trace
  The obfuscated subtrace is matched to the original subtrace and scores are produced
  The relevance score and obfuscation score are calculated

Page 22: Deobfuscation  of Virtualization-Obfuscated Software


Experimental Evaluation

VX Heavens website

Page 23: Deobfuscation  of Virtualization-Obfuscated Software

Related Work

Deobfuscation of code obfuscated via virtualization obfuscators
  Rolles, Sharif, Falliere

Programming language community
  Partial evaluation

Page 24: Deobfuscation  of Virtualization-Obfuscated Software


Conclusions

Virtualization-obfuscated programs are difficult to reverse engineer

We present a different approach to identifying the flow of values to system call instructions
