Decompilation of .NET bytecode
Stephen Horne
Trinity Hall
10th February 2004
Computer Science Part II Project Progress Report
http://hal.trinhall.cam.ac.uk/~srh38/project
Slide 2
The .NET framework
[Diagram: Managed C++, VB .NET, J# and C# sources pass through their respective compilers to CIL and metadata, which run on the Common Language Runtime.]
.NET and the Common Language Runtime
• Microsoft’s answer to Java
• The CLR is the .NET equivalent of the JVM
• Lots of useful metadata provided in assemblies
What about reversing the compilation process?
• Sometimes we want to recover source from a binary
– Language translation
– Lost source recovery
– Checking for malicious code
• Obvious legal and ethical ramifications
Slide 3
Structure of a decompiler
[Diagram: Executable → Decompiler → Source. Decompiler stages: Frontend, UDM, Backend. Intermediate forms: low-level intermediate code, unstructured control-flow graph, structured control-flow graph, high-level intermediate code.]
• Reads in bytecode
• Divides into basic blocks
• Data-flow analysis
• Control-flow analysis
• Code generation
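The first two steps can be sketched with the usual "leader" rule: a basic block starts at the method entry, at every branch target, and at every instruction following a branch. The Python below is only an illustration, not the project's implementation. The instruction list is copied from the example method on the next slide; the tuple layout and the function names `basic_blocks` and `successors` are assumptions made for this sketch.

```python
# Each instruction is (offset, opcode, branch_target or None).
# Offsets and opcodes come from the ControlExample CIL shown in this deck.
CODE = [
    (0x00, "ldc.i4.0", None), (0x01, "stloc.0", None),
    (0x02, "ldc.i4.0", None), (0x03, "stloc.1", None),
    (0x04, "br.s", 0x23),
    (0x06, "ldc.i4.3", None), (0x07, "ldloc.1", None), (0x08, "mul", None),
    (0x09, "ldarg.0", None), (0x0a, "bge.s", 0x12),
    (0x0c, "ldloc.0", None), (0x0d, "ldc.i4.1", None), (0x0e, "sub", None),
    (0x0f, "stloc.0", None), (0x10, "br.s", 0x16),
    (0x12, "ldloc.0", None), (0x13, "ldc.i4.1", None), (0x14, "add", None),
    (0x15, "stloc.0", None),
    (0x16, "ldloc.0", None), (0x17, "call", None), (0x1c, "ldloc.1", None),
    (0x1d, "blt.s", 0x06),
    (0x1f, "ldloc.1", None), (0x20, "ldc.i4.1", None), (0x21, "add", None),
    (0x22, "stloc.1", None),
    (0x23, "ldloc.1", None), (0x24, "ldarg.0", None), (0x25, "blt.s", 0x06),
    (0x27, "ldloc.0", None), (0x28, "stloc.2", None), (0x29, "br.s", 0x2b),
    (0x2b, "ldloc.2", None), (0x2c, "ret", None),
]

# Only the branch opcodes that occur in the example are handled.
UNCONDITIONAL = {"br.s"}
CONDITIONAL = {"bge.s", "blt.s"}


def basic_blocks(code):
    """Split code into basic blocks; return {leader_offset: [instructions]}."""
    leaders = {code[0][0]}                      # the entry point is a leader
    for i, (_, op, target) in enumerate(code):
        has_next = i + 1 < len(code)
        if op in UNCONDITIONAL or op in CONDITIONAL:
            leaders.add(target)                 # branch target starts a block
            if has_next:
                leaders.add(code[i + 1][0])     # so does the next instruction
        elif op == "ret" and has_next:
            leaders.add(code[i + 1][0])
    blocks, current = {}, None
    for ins in code:
        if ins[0] in leaders:
            current = ins[0]
            blocks[current] = []
        blocks[current].append(ins)
    return blocks


def successors(blocks):
    """Return {leader: [successor leaders]}, i.e. the control-flow graph."""
    order = sorted(blocks)
    succ = {}
    for idx, leader in enumerate(order):
        _, op, target = blocks[leader][-1]
        fall = order[idx + 1] if idx + 1 < len(order) else None
        if op in UNCONDITIONAL:
            succ[leader] = [target]
        elif op in CONDITIONAL:
            succ[leader] = [t for t in (target, fall) if t is not None]
        elif op == "ret" or fall is None:
            succ[leader] = []
        else:
            succ[leader] = [fall]               # plain fall-through
    return succ


blocks = basic_blocks(CODE)
cfg = successors(blocks)
for leader in sorted(cfg):
    targets = ", ".join(f"IL_{s:04x}" for s in cfg[leader])
    print(f"IL_{leader:04x} -> [{targets}]")
```

On this input the sketch produces nine blocks, which matches the nine numbered nodes in the example's control-flow graph.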
Slide 4
Example decompilation
• Divide code into basic blocks and create CFG
• Data-flow analysis
– Register copy propagation
• Control-flow analysis
– Divide graph into intervals
– Loops induced by back-edges within intervals
– Nesting of intervals determines nesting of loops
– Conditionals found by common follow nodes
– Order of nodes determines nesting of conditionals
• Generate code from structured CFG
CIL bytecode (one group per basic block)
IL_0000: ldc.i4.0
IL_0001: stloc.0
IL_0002: ldc.i4.0
IL_0003: stloc.1
IL_0004: br.s IL_0023

IL_0006: ldc.i4.3
IL_0007: ldloc.1
IL_0008: mul
IL_0009: ldarg.0
IL_000a: bge.s IL_0012

IL_000c: ldloc.0
IL_000d: ldc.i4.1
IL_000e: sub
IL_000f: stloc.0
IL_0010: br.s IL_0016

IL_0012: ldloc.0
IL_0013: ldc.i4.1
IL_0014: add
IL_0015: stloc.0

IL_0016: ldloc.0
IL_0017: call Math::Abs(int32)
IL_001c: ldloc.1
IL_001d: blt.s IL_0006

IL_001f: ldloc.1
IL_0020: ldc.i4.1
IL_0021: add
IL_0022: stloc.1

IL_0023: ldloc.1
IL_0024: ldarg.0
IL_0025: blt.s IL_0006

IL_0027: ldloc.0
IL_0028: stloc.2
IL_0029: br.s IL_002b

IL_002b: ldloc.2
IL_002c: ret
Control-flow graph
[Figure: control-flow graph of the nine basic blocks, numbered 1 to 9. Legend: Process, Entry, Exit.]
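The interval-based loop detection named on this slide can be tried out on the example method. The following Python is a textbook-style interval partition, offered as a sketch rather than as the project's actual code. The successor lists are derived from the branch targets in the CIL, with each block named by the offset of its first instruction; the function names are assumptions made for this illustration.

```python
# Control-flow graph of ControlExample: block offset -> successor offsets.
CFG = {
    0x00: [0x23],
    0x06: [0x12, 0x0c],
    0x0c: [0x16],
    0x12: [0x16],
    0x16: [0x06, 0x1f],
    0x1f: [0x23],
    0x23: [0x06, 0x27],
    0x27: [0x2b],
    0x2b: [],
}


def predecessors(succ):
    """Invert a successor map."""
    pred = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            pred[t].append(n)
    return pred


def intervals(succ, entry):
    """Partition the graph into intervals; return {header: [members]}."""
    pred = predecessors(succ)
    headers, result, placed = [entry], {}, set()
    for h in headers:                       # the list grows while we iterate
        members = [h]
        placed.add(h)
        changed = True
        while changed:                      # absorb nodes whose predecessors
            changed = False                 # all lie inside this interval
            for n in succ:
                if n in placed or n == entry:
                    continue
                if pred[n] and all(p in members for p in pred[n]):
                    members.append(n)
                    placed.add(n)
                    changed = True
        result[h] = members
        for n in succ:                      # nodes entered from here become
            if (n not in placed and n not in headers    # new headers
                    and any(p in members for p in pred[n])):
                headers.append(n)
    return result


def back_edges(succ, ivals):
    """Edges from inside an interval to its own header, as (latch, header)."""
    return [(n, h) for h, members in ivals.items()
            for n in members if h in succ[n]]


def derive(succ, ivals):
    """Collapse each interval to a single node, giving the next-order graph."""
    owner = {n: h for h, members in ivals.items() for n in members}
    derived = {h: [] for h in ivals}
    for n, targets in succ.items():
        for t in targets:
            a, b = owner[n], owner[t]
            if a != b and b not in derived[a]:
                derived[a].append(b)
    return derived


first = intervals(CFG, 0x00)
print("first-order back-edges:", back_edges(CFG, first))
second_graph = derive(CFG, first)
second = intervals(second_graph, 0x00)
print("second-order back-edges:", back_edges(second_graph, second))
```

The first pass finds a back-edge from the block at IL_0016 to the block at IL_0006, which corresponds to the do/while loop. After collapsing intervals, the second pass finds a back-edge from the interval headed at IL_0006 to the interval headed at IL_0023, which corresponds to the enclosing loop. The inner loop's interval sits inside the outer loop's interval, which is the sense in which interval nesting gives loop nesting.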
Slide 5
Current status
Original
public static int ControlExample(int x) {
    int y = 0;
    for(int i = 0; i < x; i++) {
        do {
            if(3 * i < x) y--; else y++;
        } while(Math.Abs(y) < i);
    }
    return y;
}
Decompiled
public static Int32 ControlExample(Int32 x) {
    Int32 local0;
    Int32 local1;
    Int32 local2;
    local0 = 0;
    local1 = 0;
    while (local1 < x) {
        do {
            if (((3 * local1) < x)) {
                local0 = (local0 - 1);
            } else {
                local0 = (local0 + 1);
            }
        } while (Math.Abs(local0) < local1);
        local1 = (local1 + 1);
    }
    local2 = local0;
    return local2;
}
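The fully parenthesised expressions in the decompiled output are what one would expect from rebuilding expressions by symbolically executing the evaluation stack, so that each stack temporary is substituted into the instruction that consumes it. The Python below is a minimal sketch of that idea for the handful of opcodes in the example. It is not the project's code: the function name `decompile_block`, the tuple encoding of instructions, and the `goto` output form are all assumptions made for illustration.

```python
# Operators for the arithmetic and conditional-branch opcodes in the example.
BINARY = {"add": "+", "sub": "-", "mul": "*"}
COMPARE = {"blt.s": "<", "bge.s": ">="}


def decompile_block(instructions):
    """Symbolically execute one straight-line block; return C#-like statements.

    Each instruction is a tuple: (opcode,), (branch_opcode, target_offset)
    or ("call", method_name, argument_count).
    """
    stack, out = [], []
    for op, *args in instructions:
        suffix = op.rsplit(".", 1)[-1]          # e.g. "1" in "ldloc.1"
        if op.startswith("ldc.i4."):
            stack.append(suffix)                # integer constant
        elif op.startswith("ldloc."):
            stack.append("local" + suffix)
        elif op.startswith("ldarg."):
            stack.append("arg" + suffix)
        elif op in BINARY:
            right, left = stack.pop(), stack.pop()
            stack.append(f"({left} {BINARY[op]} {right})")
        elif op.startswith("stloc."):
            out.append(f"local{suffix} = {stack.pop()};")
        elif op == "call":
            name, argc = args
            call_args = [stack.pop() for _ in range(argc)][::-1]
            stack.append(f"{name}({', '.join(call_args)})")
        elif op in COMPARE:
            right, left = stack.pop(), stack.pop()
            out.append(f"if ({left} {COMPARE[op]} {right}) goto IL_{args[0]:04x};")
        elif op == "ret":
            out.append(f"return {stack.pop()};")
        else:
            raise ValueError(f"opcode not handled in this sketch: {op}")
    return out


# The block at IL_000c and the block at IL_0006 from the example method.
print(decompile_block([("ldloc.0",), ("ldc.i4.1",), ("sub",), ("stloc.0",)]))
print(decompile_block([("ldc.i4.3",), ("ldloc.1",), ("mul",),
                       ("ldarg.0",), ("bge.s", 0x12)]))
```

For the block at IL_000c this yields `local0 = (local0 - 1);`, the same text as in the decompiled listing. For the block at IL_0006 it yields a `>=` test that jumps to the else branch; structuring then inverts it into the `(3 * local1) < x` condition shown above. The sketch prints `arg0` where the decompiler prints `x`, since parameter names are available from assembly metadata and this sketch does not read it.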
Features implemented:
• Analysis for basic conditional and looping structures
• Control flow graph generation
• C# code generation
• Almost half the CIL instruction set
• Decompiles very basic applications
Remaining tasks (lots!):
• Local variable names
• Basic language features (arrays, switching, breaks etc.)
• Advanced features (custom indexers, operator overloading, properties)
• Object oriented features
Extensions:
• Decompilation for other stack-based architectures (e.g. Java)
• Code generation for other languages (e.g. VB .NET)
• Graphical user interface