Decompilation of .NET bytecode

Stephen Horne, Trinity Hall. 10th February 2004. Computer Science Part II Project Progress Report. http://hal.trinhall.cam.ac.uk/~srh38/project


Page 1: Decompilation of .NET bytecode

Decompilation of .NET bytecode

Stephen Horne

Trinity Hall

10th February 2004

Computer Science Part II Project Progress Report

http://hal.trinhall.cam.ac.uk/~srh38/project

Page 2: Decompilation of .NET bytecode


The .NET framework

[Diagram: C#, J#, Managed C++ and VB .NET source code is compiled by the C#, J#, Managed C++ and VB .NET compilers into CIL and metadata, which is executed by the Common Language Runtime.]

.NET and the Common Language Runtime

• Microsoft’s answer to Java

• CLR is .NET equivalent of the JVM

• Lots of useful metadata provided in assemblies

What about reversing the compilation process?

• Sometimes we want to recover source from a binary

– Language translation

– Lost source recovery

– Checking for malicious code

• Obvious legal and ethical ramifications

Page 3: Decompilation of .NET bytecode


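[Diagram: an executable goes into the decompiler and source comes out. The decompiler consists of a frontend, the UDM and a backend; the intermediate representations shown are low-level intermediate code, an unstructured control-flow graph, a structured control-flow graph and high-level intermediate code.]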

Structure of a decompiler

• Reads in bytecode

• Divides into basic blocks

• Data-flow analysis

• Control-flow analysis

• Code generation
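The "divides into basic blocks" step can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the project's own code: the tuple layout `(offset, opcode, branch_target)` and the helper name `split_basic_blocks` are hypothetical, and only the branch opcodes that occur in the example are handled. The instruction list is the CIL of the ControlExample method from the example slide.

```python
# Hypothetical sketch: split a CIL method body into basic blocks.
# Each instruction is (offset, opcode, branch_target_or_None).
BRANCHES = {"br.s", "blt.s", "bge.s"}   # only the branch opcodes used in the example
NO_FALLTHROUGH = {"br.s", "ret"}        # control never continues to the next instruction

CODE = [
    (0x00, "ldc.i4.0", None), (0x01, "stloc.0", None),
    (0x02, "ldc.i4.0", None), (0x03, "stloc.1", None),
    (0x04, "br.s", 0x23),
    (0x06, "ldc.i4.3", None), (0x07, "ldloc.1", None), (0x08, "mul", None),
    (0x09, "ldarg.0", None), (0x0A, "bge.s", 0x12),
    (0x0C, "ldloc.0", None), (0x0D, "ldc.i4.1", None), (0x0E, "sub", None),
    (0x0F, "stloc.0", None), (0x10, "br.s", 0x16),
    (0x12, "ldloc.0", None), (0x13, "ldc.i4.1", None), (0x14, "add", None),
    (0x15, "stloc.0", None),
    (0x16, "ldloc.0", None), (0x17, "call", None), (0x1C, "ldloc.1", None),
    (0x1D, "blt.s", 0x06),
    (0x1F, "ldloc.1", None), (0x20, "ldc.i4.1", None), (0x21, "add", None),
    (0x22, "stloc.1", None),
    (0x23, "ldloc.1", None), (0x24, "ldarg.0", None), (0x25, "blt.s", 0x06),
    (0x27, "ldloc.0", None), (0x28, "stloc.2", None), (0x29, "br.s", 0x2B),
    (0x2B, "ldloc.2", None), (0x2C, "ret", None),
]

def split_basic_blocks(code):
    """Return (blocks, edges), both keyed by the offset of each block's first instruction."""
    # A leader is the first instruction, any branch target,
    # and any instruction that follows a branch or return.
    leaders = {code[0][0]}
    for i, (offset, op, target) in enumerate(code):
        if op in BRANCHES or op == "ret":
            if target is not None:
                leaders.add(target)
            if i + 1 < len(code):
                leaders.add(code[i + 1][0])

    # Group instructions under the most recent leader.
    blocks, current = {}, None
    for ins in code:
        if ins[0] in leaders:
            current = ins[0]
            blocks[current] = []
        blocks[current].append(ins)

    # Successors: the branch target (if any) plus the fall-through block (if any).
    order = sorted(blocks)
    edges = {}
    for n, leader in enumerate(order):
        _, op, target = blocks[leader][-1]
        successors = []
        if op in BRANCHES:
            successors.append(target)
        if op not in NO_FALLTHROUGH and n + 1 < len(order):
            successors.append(order[n + 1])
        edges[leader] = successors
    return blocks, edges

blocks, edges = split_basic_blocks(CODE)
```

On the example method this yields nine blocks, and `edges` is the unstructured control-flow graph that the later analysis phases work on.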

Page 4: Decompilation of .NET bytecode

Example decompilation

• Divide code into basic blocks and create CFG

• Data-flow analysis

– Register copy propagation

• Control-flow analysis

– Divide graph into intervals

– Loops induced by back-edges within intervals

– Nesting of intervals gives nesting of loops

– Conditionals found by common follow nodes

– Order of nodes gives nesting of conditionals

• Generate code from structured CFG

CIL bytecode (grouped into basic blocks)

IL_0000: ldc.i4.0
IL_0001: stloc.0
IL_0002: ldc.i4.0
IL_0003: stloc.1
IL_0004: br.s IL_0023

IL_0006: ldc.i4.3
IL_0007: ldloc.1
IL_0008: mul
IL_0009: ldarg.0
IL_000a: bge.s IL_0012

IL_000c: ldloc.0
IL_000d: ldc.i4.1
IL_000e: sub
IL_000f: stloc.0
IL_0010: br.s IL_0016

IL_0012: ldloc.0
IL_0013: ldc.i4.1
IL_0014: add
IL_0015: stloc.0

IL_0016: ldloc.0
IL_0017: call Math::Abs(int32)
IL_001c: ldloc.1
IL_001d: blt.s IL_0006

IL_001f: ldloc.1
IL_0020: ldc.i4.1
IL_0021: add
IL_0022: stloc.1

IL_0023: ldloc.1
IL_0024: ldarg.0
IL_0025: blt.s IL_0006

IL_0027: ldloc.0
IL_0028: stloc.2
IL_0029: br.s IL_002b

IL_002b: ldloc.2
IL_002c: ret

Control-flow graph

[Figure: the nine basic blocks drawn as nodes numbered 1 to 9; legend: Process, Entry, Exit.]
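The interval and back-edge steps of the control-flow analysis can be illustrated on this example. The following Python sketch is an assumption-laden illustration rather than the project's implementation: the function names `intervals`, `loops` and `derive` are hypothetical, and the basic blocks are named by the offset of their first instruction instead of the slide's node numbers. It partitions the graph into intervals, reports back-edges to an interval header, and collapses intervals into a derived graph so that the same search can find enclosing loops.

```python
# Hypothetical sketch of interval-based loop detection.
# Keys are basic blocks (offset of first instruction); values are successor blocks.
CFG = {
    0x00: [0x23],
    0x06: [0x12, 0x0C],
    0x0C: [0x16],
    0x12: [0x16],
    0x16: [0x06, 0x1F],
    0x1F: [0x23],
    0x23: [0x06, 0x27],
    0x27: [0x2B],
    0x2B: [],
}

def intervals(succ, entry):
    """Partition a graph into intervals; returns {header: [members...]}."""
    pred = {n: set() for n in succ}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)
    result, assigned, headers = {}, set(), [entry]
    while headers:
        h = headers.pop(0)
        members = [h]
        assigned.add(h)
        grew = True
        while grew:
            # Absorb every unassigned node whose predecessors all lie in the interval.
            grew = False
            for n in succ:
                if n not in assigned and pred[n] and pred[n] <= set(members):
                    members.append(n)
                    assigned.add(n)
                    grew = True
        result[h] = members
        # Any remaining node entered from this interval heads a new interval.
        for n in succ:
            if n not in assigned and n not in headers and pred[n] & set(members):
                headers.append(n)
    return result

def loops(succ, ivals):
    """Back-edges from inside an interval to its header, as (latch, header) pairs."""
    return [(n, h) for h, members in ivals.items()
            for n in members if h in succ[n]]

def derive(succ, ivals):
    """Collapse each interval to a single node named by its header."""
    owner = {n: h for h, members in ivals.items() for n in members}
    return {h: sorted({owner[t] for n in members for t in succ[n]} - {h})
            for h, members in ivals.items()}

first = intervals(CFG, 0x00)
inner_loops = loops(CFG, first)

derived = derive(CFG, first)
second = intervals(derived, 0x00)
outer_loops = loops(derived, second)
```

Here `inner_loops` contains the back-edge from the block at IL_0016 to the block at IL_0006, which corresponds to the do-while loop, and `outer_loops` contains a back-edge to the interval headed by IL_0023, which corresponds to the enclosing for loop. The loop found in the derived graph encloses the one found in the original graph, which is how the nesting of intervals gives the nesting of loops.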

Page 5: Decompilation of .NET bytecode


Current status

Original

public static int ControlExample(int x) {
    int y = 0;

    for(int i = 0; i < x; i++) {
        do {
            if(3 * i < x) y--;
            else y++;
        } while(Math.Abs(y) < i);
    }

    return y;
}

Decompiled

public static Int32 ControlExample(Int32 x) {
    Int32 local0;
    Int32 local1;
    Int32 local2;
    local0 = 0;
    local1 = 0;
    while (local1 < x) {
        do {
            if (((3 * local1) < x)) {
                local0 = (local0 - 1);
            } else {
                local0 = (local0 + 1);
            }
        } while (Math.Abs(local0) < local1);
        local1 = (local1 + 1);
    }
    local2 = local0;
    return local2;
}
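The fully parenthesised statements in the decompiled output, such as `local0 = (local0 - 1)`, are the kind of result one gets by simulating the CIL evaluation stack symbolically. The sketch below shows that idea in Python; it is an illustration under assumptions, not the project's code. The function name `lift` and the `arg0` naming are hypothetical, and only the handful of opcodes needed for the example are covered.

```python
# Hypothetical sketch: rebuild statements from stack-based CIL by symbolic execution.
BINARY = {"add": "+", "sub": "-", "mul": "*"}

def lift(opcodes):
    """Turn a straight-line run of CIL opcodes into C#-like statements.

    Returns (statements, stack): the statements emitted so far and any
    expressions still left on the symbolic stack (e.g. branch operands).
    """
    stack, statements = [], []
    for op in opcodes:
        suffix = op.rsplit(".", 1)[-1]
        if op.startswith("ldc.i4."):
            stack.append(suffix)                    # push an integer constant
        elif op.startswith("ldloc."):
            stack.append("local" + suffix)          # push a local variable
        elif op.startswith("ldarg."):
            # Assumed name; a real decompiler can take parameter names from metadata.
            stack.append("arg" + suffix)
        elif op in BINARY:
            right = stack.pop()
            left = stack.pop()
            stack.append(f"({left} {BINARY[op]} {right})")
        elif op.startswith("stloc."):
            statements.append(f"local{suffix} = {stack.pop()};")
        else:
            raise ValueError(f"opcode not handled in this sketch: {op}")
    return statements, stack

# The block at IL_000c (minus its final branch) becomes one assignment.
decrement, leftover = lift(["ldloc.0", "ldc.i4.1", "sub", "stloc.0"])

# The block at IL_0006 (minus its final branch) leaves the two comparison operands.
_, operands = lift(["ldc.i4.3", "ldloc.1", "mul", "ldarg.0"])
```

In this sketch `decrement` holds the single statement `local0 = (local0 - 1);`, and `operands` holds `(3 * local1)` and `arg0`, the two values that the conditional branch at the end of the block compares.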

Features implemented:

• Analysis for basic conditional and looping structures

• Control flow graph generation

• C# code generation

• Almost half the CIL instruction set

• Decompiles very basic applications

Remaining tasks (lots!):

• Local variable names

• Basic language features (arrays, switching, breaks etc.)

• Advanced features (custom indexers, operator overloading, properties)

• Object oriented features

Extensions:

• Decompilation for other stack-based architectures (e.g. Java)

• Code generation for other languages (e.g. VB .NET)

• Graphical user interface