EECS 354 Network Security
Reverse Engineering
Introduction
Preventing Reverse Engineering
Reversing High Level Languages
Reversing an ELF Executable
Anything is possible
There is no computer system in existence that cannot be reverse engineered
Most important limiting factors
Complexity
Time
Reversing by Language
Ruby, JavaScript, HTML, etc.
Not compiled
Python, Java, C#, VB.NET, etc.
Byte compiled
Easier to decompile/inspect
Many symbols still exist in bytecode
C, C++
Compiled into machine code
Much harder to decompile
Still possible to reverse engineer with a debugger and disassembler
Scalability of techniques
Basic reversing techniques work for small code bases
It’s possible to determine what assembly code does for a 100 line C program without too much difficulty
Not used heavily by hackers
When trying to hack an application, crashes and error messages are better hints
Windows
Is it possible to reverse engineer Windows?
How many lines of code does it have?
How long would it take?
Wine’s reverse engineering
The Wine project attempts to implement the Windows API
Project began in 1993, still unstable and incomplete
Has over 1.4 million lines of code (written by 700 contributors)
Does not cover all of Windows (core OS, windowing, etc)
On the other hand, Samba (reverse engineering Windows file sharing) has been pretty successful
Why Reverse Engineering?
Defense
Security companies often reverse malware binaries
Protocol reversing for botnet analysis
Working with proprietary APIs or protocols
Hacking
Finding vulnerabilities is easier with the code
Preventing reverse engineering
Obfuscation
Translate code into something unreadable or unnatural
Must trick a human reader without tricking the machine interpreter/loader
Beyond its most basic form, reverse engineering is largely a matter of combating software obfuscation
Obfuscation Techniques
Renaming functions/variables
Adding bogus code with no side effects
Removing whitespace
Writing strings/numbers as hex values
Using “dynamic” code
JavaScript: eval
Java: GetName, GetAttribute
Python: getattr, setattr
Most of these are reversible
Except that the original function/variable names can’t be recovered
Packing
Storing an executable as a string (or otherwise) within an executable
Can make use of compression and encryption to hide contents
Decompression or decryption code must be packed in the executable as well
Complex packers exist for most languages
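The packing idea can be illustrated with a toy example. This is a rough sketch under simplifying assumptions: it packs Python source rather than a native executable, uses compression only (no encryption), and the helper names `pack` and `unpack` are hypothetical. It also shows why packing alone is weak: the unpacking logic ships with the payload, so an analyst can decode the payload without running it.

```python
import base64
import zlib


def pack(source: str) -> str:
    """Return a stub program that carries `source` compressed inside a string."""
    payload = base64.b64encode(zlib.compress(source.encode("utf-8"))).decode("ascii")
    return (
        "import base64, zlib\n"
        f"exec(zlib.decompress(base64.b64decode('{payload}')).decode('utf-8'))\n"
    )


def unpack(stub: str) -> str:
    """Recover the hidden source by decoding the payload instead of executing it."""
    payload = stub.split("b64decode('", 1)[1].split("'", 1)[0]
    return zlib.decompress(base64.b64decode(payload)).decode("utf-8")


original = "secret = 6 * 7\n"
stub = pack(original)

namespace = {}
exec(stub, namespace)      # running the stub unpacks and executes the payload
recovered = unpack(stub)   # static unpacking: nothing is executed

print(stub)
print(recovered)
```

Real packers work on machine code and often add encryption or anti-debugging checks, but the structure (stub plus encoded payload) is the same.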
JavaScript Obfuscation
<script>eval(unescape('%3C%64%69%76%20%73%74'))</script>
<script>a = 't'; b = 'er'; c = 'a'; d = eval; e = '\"XSS\"'; d(c+'l'+b+a+'('+e+')'); </script>
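Both snippets can be undone without running them in a browser. The sketch below uses Python rather than JavaScript, with `urllib.parse.unquote` standing in for JavaScript's `unescape`. The two agree for plain ASCII percent-escapes like these, though they may differ on other input. The second part redoes the string concatenation by hand.

```python
from urllib.parse import unquote

# First snippet: percent-decode the argument that would be passed to eval.
encoded = "%3C%64%69%76%20%73%74"
decoded = unquote(encoded)
print(decoded)

# Second snippet: rebuild the string passed to the aliased eval (d).
a, b, c, e = "t", "er", "a", '"XSS"'
call_text = c + "l" + b + a + "(" + e + ")"
print(call_text)
```

The first payload decodes to the (truncated) start of an HTML tag, and the second reassembles into a call to `alert`.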
Reversing High Level Languages
What is byte code?
Byte code is compiled code that cannot be executed directly by the processor
Distinct from machine code
Architecture independent
Executed by software: an interpreter or VM, often with a JIT compiler
Byte code is often dynamic
Symbols can be referenced at runtime
This means the program structure still exists and can be rebuilt
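Python makes it easy to see how much survives byte compilation. The sketch below inspects a function's code object with the standard `dis` module. The function `check_password` and the string `"hunter2"` are invented for illustration.

```python
import dis


def check_password(candidate):
    expected = "hunter2"
    return candidate.strip() == expected


code = check_password.__code__
print(code.co_name)       # the function's own name is kept
print(code.co_varnames)   # argument and local variable names are kept
print(code.co_names)      # attribute/global names referenced at runtime
print(code.co_consts)     # constants, including string literals

# Human-readable listing of the opcodes a decompiler would translate.
dis.dis(check_password)
```

Because names and constants are stored alongside the opcodes, a decompiler has most of what it needs to rebuild readable source.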
Decompilers
Decompilers reverse the steps taken by a compiler
Opcode translation
Abstract Syntax Tree construction
Python: Uncompyle2, decompyle, unpyc
Java: Jad, JD
Reversing an ELF Executable
Executables
Machine code is changed significantly from the original source code
Variables have been allocated to registers or somewhere in memory
Optimization steps have changed the program structure
No way to decompile this back to the original source
Machine instructions translate directly to assembly code
Disassembly analysis can be effective
Reversing Executables
We will be focusing on x86 32-bit LSB ELF executables
An ELF file contains an ELF header, program header table, section header table, and data
May also contain a symbol table
ELF Header contains program entry point, basic identifying information
Program header describes memory segments (e.g. where in memory will segments be loaded? what parts of memory are r/w/x?)
Used at program load time
Section table describes section layout (e.g. where’s the .rodata? .text? .bss?)
Used at link time
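The ELF header fields mentioned above can be read with a few lines of code. This is a minimal sketch that handles only the 32-bit little-endian case and follows the standard 52-byte ELF32 header layout. The function name `parse_elf32_header` and the sample values (including the entry address) are made up so that the example runs without a real binary.

```python
import struct

# e_ident[16], e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
# e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")  # 52 bytes, little-endian


def parse_elf32_header(data: bytes) -> dict:
    """Parse the header of a 32-bit LSB ELF file into a small dictionary."""
    if len(data) < ELF32_HEADER.size:
        raise ValueError("data is shorter than an ELF32 header")
    (ident, e_type, machine, _version, entry, phoff, shoff, _flags,
     _ehsize, _phentsize, phnum, _shentsize, shnum, _shstrndx) = ELF32_HEADER.unpack_from(data)
    if ident[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic bytes")
    if ident[4] != 1 or ident[5] != 1:
        raise ValueError("only 32-bit, little-endian (LSB) files are handled")
    return {"type": e_type, "machine": machine, "entry": entry,
            "phoff": phoff, "shoff": shoff, "phnum": phnum, "shnum": shnum}


# Synthetic header: ET_EXEC (2), EM_386 (3), hypothetical entry point,
# one program header right after the ELF header, and no section table.
sample = ELF32_HEADER.pack(
    b"\x7fELF\x01\x01\x01" + b"\x00" * 9,
    2, 3, 1, 0x08048000, 52, 0, 0, 52, 32, 1, 40, 0, 0,
)
info = parse_elf32_header(sample)
print(hex(info["entry"]), info["phnum"], info["shnum"])
```

On a real file, the same function could be applied to the first 52 bytes read in binary mode. `readelf -h` reports the same fields.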
x86 Assembly
mov
add, sub, shl, shr, sar, mul, div
and, or, xor
jmp, je, jne, jl, jg, jle, jge
cmp, test
call, push, pop, ret, nop
Memory operands (AT&T syntax): 0x8(%esp), -0xc(%ebp)
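The two memory operands above use the AT&T form displacement(base register), meaning "the address held in the register plus the displacement". The sketch below computes that address for the simple form only (no index or scale). The helper name `effective_address` and the register values are hypothetical.

```python
import re

# Matches the simple AT&T form: optional hex displacement, then (%register).
OPERAND = re.compile(r"^(-?0x[0-9a-fA-F]+)?\(%([a-z]+)\)$")


def effective_address(operand: str, registers: dict) -> int:
    """Return base register + displacement, wrapped to 32 bits."""
    match = OPERAND.match(operand)
    if match is None:
        raise ValueError(f"unsupported operand: {operand}")
    displacement = int(match.group(1), 16) if match.group(1) else 0
    return (registers[match.group(2)] + displacement) & 0xFFFFFFFF


# Made-up register contents, chosen only for the example.
registers = {"esp": 0xFFFFD000, "ebp": 0xFFFFD020}
stack_slot = effective_address("0x8(%esp)", registers)
local_var = effective_address("-0xc(%ebp)", registers)
print(hex(stack_slot), hex(local_var))
```

In compiled C code, positive offsets from %esp or %ebp typically reach arguments or stack slots, and negative offsets from %ebp typically reach local variables.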
Reversing Basics
Basic tools:
file
strings
strace (and ltrace)
nm
objdump or readelf
tcpdump
gdb
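Of the tools listed, `strings` is the simplest to approximate. The following is a rough sketch of the idea (runs of printable ASCII of at least four characters), not a reimplementation of the real utility. The function name `extract_strings` and the sample bytes are invented for illustration.

```python
import re


def extract_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII characters at least `min_length` long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Made-up binary blob: short runs such as "ELF" and "ab" are filtered out.
sample = b"\x7fELF\x01\x00\x00password: \x00\x02ab\x00/bin/sh\x00"
found = extract_strings(sample)
print(found)
```

Hard-coded prompts, file paths, and format strings found this way are often the quickest hints about what a binary does.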
You can reverse anything with a good debugger, but…
Reversing Frameworks
For more advanced reversing, it may help to have more than just a debugger
IDA
Radare
ELF Obfuscation
There are some additional techniques for obfuscating executable formats:
Storing data in unusual sections: .ctors, .dtors, .init, etc
“Corrupting” the ELF header
Stripping the symbol table
Calling ptrace to detect and block debuggers
Packing
Code is unpacked dynamically during execution
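Debugger detection can be sketched without calling ptrace directly. The example below uses a related, Linux-specific check that is named here as a substitute technique: it reads the `TracerPid` field of `/proc/self/status`, which is nonzero while a tracer such as gdb or strace is attached. The function names `tracer_pid` and `being_traced` are hypothetical.

```python
def tracer_pid(status_text: str) -> int:
    """Extract the TracerPid value from the text of a /proc/<pid>/status file."""
    for line in status_text.splitlines():
        if line.startswith("TracerPid:"):
            return int(line.split(":", 1)[1].strip())
    return 0


def being_traced(status_path: str = "/proc/self/status") -> bool:
    """Return True if a tracer appears to be attached (Linux only)."""
    try:
        with open(status_path) as handle:
            return tracer_pid(handle.read()) != 0
    except OSError:
        return False  # no procfs available: assume not traced


# Made-up status excerpts for illustration.
traced_example = "Name:\tcrackme\nState:\tt (tracing stop)\nTracerPid:\t1234\n"
clean_example = "Name:\tcrackme\nState:\tR (running)\nTracerPid:\t0\n"
print(tracer_pid(traced_example), tracer_pid(clean_example))
```

A reverser can defeat checks like this by patching the branch that tests the result, which is one reason such checks are usually combined with packing.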
Malware Examples
Demo...
Source: http://crackmes.de/users/synamics/xrockmr/