PANDEMONIUM:
Automated Identification of Cryptographic Algorithms using Dynamic Binary Instrumentation and Fuzzy Hashing
Yuma Kurogome
CODE BLUE 2015 [U-25]
2015.10.29
This material is partially based upon work supported by
Asian Office of Aerospace Research and Development,
U.S. Air Force Office of Scientific Research under Award No. FA2386-15-1-4068.
$ whoami
• Yuma Kurogome (@ntddk)
• ntddk.github.io
• Peer review, Security Camp lecturer, AVTOKYO speaker
Abstract
• Malware utilize many cryptographic algorithms
  • To conceal messages and configurations
• DBI (Dynamic Binary Instrumentation)
  • Dynamic analysis on PANDA (QEMU)
  • Translate x86 code to LLVM IR (intermediate representation) per BB (basic block: one entry, one exit)
  • Remove obfuscated code by optimization
• Fuzzy hash based pattern matching
  • Detect and avoid anti-analysis code
  • Identify cryptographic algorithms from the similarity of handling received data
Malware and crypto-algorithms
Malware utilize many crypto-algorithms
to conceal messages and configurations
• Banking trojan
  • Decrypt configuration files
• Ransomware
  • Encrypt victim files
We deal with banking trojans in this research
Server (C&C) has the key
Key is hardcoded in the malware's own body
Evolution of banking trojan
New malware keeps emerging, one after
another, from the black market
• Many variants were born from leaked Zeus
  • Citadel
  • IceIX
  • GameOver
  • KINS
• New species have also been born
  • Dyre
  • Vawtrak
  • Chthonic
http://www.wontok.com/wp-content/uploads/2014/10/wdt0185_MalwareTimeline_largeV2.jpg
Banking trojan and crypto-algorithms
Many banking trojans utilize encrypted
configuration files and commands
• Ex. Communication between Dyre and C&C
[Figure: Dyre and C&C communication; labels: Key + IV, Encrypted data]
We have to identify crypto-algorithms promptly
Related work (1/2)
Identify crypto-algorithms by paying
attention to the arithmetic/bit operations
• Dispatcher [CCS’09]
  • Find crypto-routines from the ratio of arithmetic/bit insns between call and ret insns
  • Cannot find crypto-routines that are made of multiple subroutines
• ReFormat [ESORICS’09]
  • Find crypto-routines from the peak in the overall execution log
  • Cannot find crypto-routines if multiple algorithms are implemented
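The arithmetic/bit-operation heuristic can be sketched roughly as below. This is a minimal illustration and not the implementation of Dispatcher or ReFormat: the mnemonic set, the helper names `arith_ratio` and `looks_like_crypto`, and the 0.5 threshold are all assumptions made for the example.

```python
# Hypothetical set of arithmetic/bitwise mnemonics; real tools classify insns their own way.
ARITH_BIT_OPS = {"add", "sub", "mul", "xor", "and", "or", "not", "shl", "shr", "rol", "ror"}

def arith_ratio(insns):
    """Fraction of arithmetic/bitwise insns inside one routine (call..ret)."""
    if not insns:
        return 0.0
    mnemonics = [line.split()[0] for line in insns]
    return sum(1 for m in mnemonics if m in ARITH_BIT_OPS) / len(mnemonics)

def looks_like_crypto(insns, threshold=0.5):
    # The threshold is an assumed value, chosen only for illustration.
    return arith_ratio(insns) >= threshold

crypto_like = ["xor eax, ebx", "rol eax, 3", "add eax, ecx", "mov [edi], eax"]
plain = ["push ebp", "mov ebp, esp", "call 0x401000", "ret"]
print(arith_ratio(crypto_like), looks_like_crypto(plain))
```

A per-routine ratio like this is what breaks down when one cipher is spread over several subroutines, since each piece is scored in isolation.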
Related work (2/2)
Identify crypto-algorithms by paying
attention to the loop structures
• Aligot [CCS’11]
  • Extract the input of the loop structures, and give it to implementations of known algorithms
  • If the output is the same, the algorithm is the same
  • The amount of calculation is large, O(n^2), and it can only extract known crypto-algorithms
• Kerckhoffr [RAID’11]
  • Extract the input of the loop structures, and compare it with signatures of known algorithms
  • If a pattern matches, regard the code as a crypto-routine
  • Can only extract known crypto-algorithms
Downside of related work
| Method | Known algorithms | Unknown algorithms | Anti anti-analysis |
|---|---|---|---|
| Dispatcher | | | ☓ |
| ReFormat | | | ☓ |
| Aligot | | ☓ | ☓ |
| Kerckhoffr | | ☓ | ☓ |
• Previous approaches assume the execution log is infallible
• PANDEMONIUM can analyze malware even if it has anti-analysis routines and has been obfuscated
Anti-analysis
Many malware samples try to detect debuggers
and sandboxes to avoid analysis
We often cannot obtain the expected analysis results
There is no silver bullet
Analysis platforms have not been able to keep up
with the complex techniques of malware
We need an extensible analysis platform
PANDEMONIUM
Combine different approaches to identify
decrypt-routines of malware
[Figure labels: Avoid anti-analysis, Network communication, Remove obfuscated code, Identify crypto-algorithms]
[Figure labels: PANDA, Guest OS, malware, LLVM IR, Analysis log, PANDEMONIUM, Dynamic analysis, Static analysis]
Emulation by QEMU
• TCG (Tiny Code Generator)
1. Disassemble target code, and create BB(Basic Block) separated by branch insns
2. Translate BB to RISC-like TCG IR
3. Translate TCG IR to host code
4. Build chain of translated BBs and execute
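Step 1 above can be sketched as follows. This is a simplified illustration, not QEMU's translator: the branch mnemonic set and the helper name `split_basic_blocks` are assumptions, and the sketch only ends a block at a branch insn (it does not start new blocks at branch targets).

```python
# Assumed subset of x86 control-transfer mnemonics, for illustration only.
BRANCH_MNEMONICS = {"jmp", "je", "jne", "jz", "jnz", "call", "ret"}

def split_basic_blocks(insns):
    """Split a linear instruction list into blocks that each end at a branch insn."""
    blocks, current = [], []
    for insn in insns:
        current.append(insn)
        if insn.split()[0] in BRANCH_MNEMONICS:
            blocks.append(current)   # a branch closes the current block
            current = []
    if current:                      # trailing insns without a branch
        blocks.append(current)
    return blocks

trace = ["push ebp", "mov ebp, esp", "cmp eax, 0x7DF", "je 0xdeadbaad",
         "mov esi, 0x13", "ret"]
blocks = split_basic_blocks(trace)
for number, block in enumerate(blocks):
    print(number, block)
```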
PANDA [REcon’14]
• DBI (Dynamic Binary Instrumentation)
1. Disassemble target code, and create BB(Basic Block) separated by branch insns
2. Translate BB to RISC-like TCG IR
3. Translate TCG IR to LLVM IR
4. Translate TCG IR to host code
5. Build chain of translated BBs and execute
Example output of steps 1 to 3:
1. (x86)
push esp
push ebp
push ebx
2. (TCG IR)
movi_i64 tmp12,$0x8260a634
st_i64 tmp12,env,$0xdae0
ld_i64 tmp12,env,$0xdad0
3. (LLVM IR)
%2 = add i64 %env_v, 128
%3 = inttoptr i64 %2 to i64*
store i64 2187372084, i64* %3
• Callback before/after translation
• Can apply taint analysis and symbolic execution
We can obtain LLVM IR corresponding to the malware code
github.com/moyix/panda
Extract decrypt-routines (1/5)
Combine different approaches to identify
decrypt-routines of malware
[Figure labels: OS, Malware, Obfuscated code, Anti-analysis routine, Handler to received data, ……, Decrypt-routine, Obfuscated code]
[Figure: PsActiveProcessHead (Flink/Blink) chains the ActiveProcessLinks (Flink/Blink) of successive EPROCESS structures, each shown with its PEB. Other labels: FS:[0x30], KPCR, KdVersionBlock, FS:[0x1c], KDEBUGGER_DATA32, PsLoadedModuleList; offsets shown: +0x34, +0x70, +0x78]
An EPROCESS is generated when a process is created
panda/qemu/panda_plugins/osi_winxpsp3x86/osi_winxpsp3x86.cpp
Extract the malware process from the running guest OS
(The register is different on Windows 7 or later)
Expand
Extract decrypt-routines (2/5)
Combine different approaches to identify
decrypt-routines of malware
[Figure labels: Malware, Obfuscated code, Anti-analysis routine, Handler to received data, ……, Decrypt-routine, Obfuscated code]
LLVM (1/2)
Optimization pass of LLVM can remove
some obfuscated code
[Figure (LLVM pipeline, source: llvm.org): x86 → PANDA as frontend → TCG IR → LLVM IR]
Remove obfuscated code
Optimization pass of LLVM can remove
some obfuscated code
• Insert dead/nop equivalent insns
  • -dse, -simplifycfg
  • (y = 3; ...; y = x + 1) → (...; y = x + 1)
• Substitute with equivalent insns/Reorder insns
  • -constprop
  • (x = 14; y = x + 8) → (x = 14; y = 22)
  • -instcombine
  • (y = x + 2; z = y + 3) → (z = x + 5)
Absorb differences in insns that come from the compiler implementation
Cf. opticode.coseinc.com
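The effect of two of these passes can be imitated on a toy straight-line IR, as a rough sketch. This is a Python re-implementation for illustration only, not LLVM's passes: the tuple encoding `(dest, expr)`, where `expr` is an integer or `("add", a, b)`, and the function names are assumptions.

```python
def const_prop(code):
    """Fold additions whose operands are known constants (in the spirit of -constprop)."""
    known, out = {}, []
    for dest, expr in code:
        if isinstance(expr, tuple):                     # ("add", a, b)
            _, a, b = expr
            a, b = known.get(a, a), known.get(b, b)     # substitute known constants
            both_const = isinstance(a, int) and isinstance(b, int)
            expr = a + b if both_const else ("add", a, b)
        if isinstance(expr, int):
            known[dest] = expr
        else:
            known.pop(dest, None)                       # dest is no longer a constant
        out.append((dest, expr))
    return out

def dead_store_elim(code):
    """Drop an assignment that is overwritten before it is read (in the spirit of -dse)."""
    out = []
    for i, (dest, expr) in enumerate(code):
        dead = False
        for later_dest, later_expr in code[i + 1:]:
            reads = later_expr[1:] if isinstance(later_expr, tuple) else ()
            if dest in reads:
                break                                   # value is used: keep it
            if later_dest == dest:
                dead = True                             # overwritten first: dead store
                break
        if not dead:
            out.append((dest, expr))
    return out

print(const_prop([("x", 14), ("y", ("add", "x", 8))]))
print(dead_store_elim([("y", 3), ("z", ("add", "x", 1)), ("y", ("add", "x", 1))]))
```

The two calls reproduce the (x = 14; y = x + 8) and (y = 3; ...; y = x + 1) examples from the slide.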
Extract decrypt-routines (3/5)
Combine different approaches to identify
decrypt-routines of malware
[Figure labels: Malware, Anti-analysis routine, Handler to received data, ……, Decrypt-routine, Obfuscated code]
Anti-emulation
We also have to consider anti-emulation
Fuzzy hashing (1/2)
Techniques for identifying data
that are partially different but similar
• ssdeep
  • World leading security researchers will come together for this unique international conference in Tokyo
    • Bb7g86hvE/
  • W0rld leading security researchers will come together for this unique international conference in Tokyo
    • GT7g86hvE/
Create signatures of some anti-analysis routines and crypto-algorithms
Fuzzy hashing (2/2)
Techniques for identifying data
that are partially different but similar
• Create fuzzy hash per BB
  • Normalize operand
• Anti-analysis
  • NtDelayExecution(), WaitForSingleObject(), GetCursorPos(), ……
• Crypto-algorithms
  • MD5, DES, RC4, ……
  • From Beecrypt, Crypto++, OpenSSL
Create signatures of some anti-analysis routines and crypto-algorithms
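Operand normalization followed by a per-BB fuzzy hash might look like the sketch below. It is written against x86-style text for readability, whereas the talk hashes LLVM IR; the regular expressions, the placeholder tokens REG/IMM/MEM, and the simplified context-triggered piecewise hash (a rolling window sum plus an FNV-1a style piece hash) are assumptions and do not reproduce ssdeep's actual algorithm or output.

```python
import re

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
MEMORY = re.compile(r"\[[^\]]*\]")
IMMEDIATE = re.compile(r"\b(?:0x[0-9a-fA-F]+|\d+)\b")
REGISTER = re.compile(r"\b(?:e?[abcd]x|e?[sd]i|e?[sb]p|[abcd][lh])\b")

def normalize(insn):
    """Replace concrete operands with placeholders so operand choice does not matter."""
    insn = MEMORY.sub("MEM", insn)       # whole memory operand first
    insn = IMMEDIATE.sub("IMM", insn)
    return REGISTER.sub("REG", insn)

def fuzzy_hash(text, window=4, modulus=8):
    """Simplified piecewise hash: emit one character whenever the rolling sum triggers."""
    data = text.encode()
    signature, piece = [], 0x811C9DC5
    for i, byte in enumerate(data):
        piece = ((piece ^ byte) * 0x01000193) & 0xFFFFFFFF
        rolling = sum(data[max(0, i - window + 1): i + 1])
        if rolling % modulus == modulus - 1:     # context trigger: close this piece
            signature.append(ALPHABET[piece % 64])
            piece = 0x811C9DC5
    signature.append(ALPHABET[piece % 64])       # hash of the trailing piece
    return "".join(signature)

def hash_basic_block(insns):
    return fuzzy_hash("\n".join(normalize(insn) for insn in insns))

bb_a = ["mov eax, 0x13", "add eax, [ebp+8]", "xor eax, ebx"]
bb_b = ["mov esi, 19", "add esi, [edi+0x20]", "xor esi, ecx"]
print(hash_basic_block(bb_a) == hash_basic_block(bb_b))
```

Because both blocks normalize to the same token stream, they receive the same signature even though their registers and constants differ.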
LLVM (2/2)
Modify TCG IR based on pattern matching
of LLVM IR before execution
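[Figure (LLVM pipeline, source: llvm.org): x86 → PANDA as frontend → TCG IR → LLVM IR; the LLVM IR is pattern-matched against the fuzzy hash table (red-black tree) and the result is fed back to the TCG IR]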
Symbolic execution (1/2)
Technique for extracting path constraints
through operation of symbolic variables
Source code
if(x!=2015)
Trace log
cmp eax, 0x7DF
je 0xdeadbaad
Counterexample
Invalid.
ASSERT( INPUT_*_*_* =0hex7DF );
2015 affects the branch
Symbolic execution (2/2)
Technique for extracting path constraints
through operation of symbolic variables
• Insns must be in SSA (Static Single Assignment) form
  • On x86, assignments may collide
mov esi, 0x13
mov edx, 0x7DF
→ (esi == 0x13) and (edx == 0x7DF)
mov esi, 0x13
…
mov esi, 0x7DF
→ (esi == 0x13) and (esi == 0x7DF)
LLVM IR is suitable for symbolic execution
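Why SSA form avoids the collision can be shown with a small sketch. The helpers `to_ssa` and `consistent` are hypothetical names for this illustration; a real symbolic executor would hand the constraints to a solver instead of the naive check used here.

```python
def to_ssa(assignments):
    """Give every assignment to a register a fresh versioned name."""
    version, renamed = {}, []
    for reg, value in assignments:
        version[reg] = version.get(reg, 0) + 1
        renamed.append((f"{reg}_{version[reg]}", value))
    return renamed

def consistent(constraints):
    """True if no name is required to equal two different values."""
    seen = {}
    for name, value in constraints:
        if seen.setdefault(name, value) != value:
            return False
    return True

trace = [("esi", 0x13), ("esi", 0x7DF)]   # mov esi, 0x13 ... mov esi, 0x7DF
print(consistent(trace))                  # naive constraints contradict each other
print(to_ssa(trace), consistent(to_ssa(trace)))
```

Taken literally, the x86 trace yields (esi == 0x13) and (esi == 0x7DF), which can never hold together; after renaming, the two assignments constrain different names.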
Anti anti-analysis
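#include <windows.h>

/* Returns 0 if Sleep(500) really took more than 450 ms, 1 if it looks patched. */
static inline int IsSleepPatched()
{
    DWORD time1 = GetTickCount();
    Sleep(500);
    DWORD time2 = GetTickCount();
    if ((time2 - time1) > 450)
        return 0;
    else
        return 1;
}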
Avoid anti-analysis code that matched a
pattern by using symbolic execution
• Ex. Avoid patch detection of Sleep()
  • RDTSC, GetTickCount(), ……
• Which branch to go?
  1. Get snapshot
  2. Rewrite branch constraints
  3. Long-lasting branch is taken (check 50 insns)
Or the expected number of clocks is spent
Extract decrypt-routines (4/5)
Combine different approaches to identify
decrypt-routines of malware
[Figure labels: Malware, Handler to received data, ……, Decrypt-routine, Obfuscated code]
Taint analysis (1/2)
[Figure labels: VMM, Guest OS, example insn: mov eax, edx]
Technique that analyzes dependencies
between data by propagating tags
Taint analysis (2/2)
Handler BBs of data received from the virtual
NIC would contain decrypt-routines
• Taint source (origin of tags)
  • Virtual NIC
• Taint sink (check position of tags)
  • End of BB
• Propagation rule
  • Reference of register and memory
  • r3 = Load(r2) ⇒ tr3 = tr2
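The source, sink, and propagation rule can be sketched on a toy instruction format. The tuple encoding `(op, dst, src)`, the `recv` pseudo-op standing in for data arriving from the virtual NIC, and the function name `run_block` are assumptions for illustration; PANDA's taint plugin works on LLVM IR and is considerably more detailed.

```python
def run_block(insns, tainted=None):
    """Propagate taint tags through one BB and return the tagged locations at its end."""
    tainted = set(tainted or ())
    for op, dst, src in insns:
        if op == "recv":                   # taint source: data from the virtual NIC
            tainted.add(dst)
        elif op in ("mov", "load"):        # r3 = Load(r2)  =>  tr3 = tr2
            if src in tainted:
                tainted.add(dst)
            else:
                tainted.discard(dst)
        elif op == "const":                # a constant overwrites any previous tag
            tainted.discard(dst)
    return tainted                         # taint sink: checked at the end of the BB

block = [
    ("recv", "buf", None),
    ("mov", "r2", "buf"),
    ("load", "r3", "r2"),
    ("const", "r4", 0),
]
tags = run_block(block)
print(sorted(tags), "handler candidate:", bool(tags))
```

A BB that still holds tags at its end is treated here as a candidate handler of received data.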
Anti taint analysis
Obfuscation techniques that interrupt
the propagation of taint tags
• Under-tainting
  • Data is not assigned directly
x = get_input();
if (x == "a"){
    uri = "c2.php";
    msg = "a";
}
send(uri, msg);

x = get_input();
if (x > "a"){
    tmp = x + "a";
    msg = tmp - x;
}
send(uri, msg);
But we have LLVM
-early-cse, -constprop, -instcombine
Extract decrypt-routines (5/5)
Combine different approaches to identify
decrypt-routines of malware
[Figure labels: Malware, Handler to received data, ……, Decrypt-routine]
Now what?
Handler BBs of data received from the virtual
NIC would contain decrypt-routines
Decrypt
1. Execute malware
2. Avoid anti-analysis
3. Remove obfuscated code
4. Extract handler BBs of received data
5. Identify crypto-algorithms
Criteria for crypto-algorithm
Is a fuzzy hash per BB useful for
identifying crypto-algorithms?
• Comparing per BB cannot maintain uniqueness as a signature
  • There are many similar insns, so there are many false positives
• Features do not stand out the way they do for anti-analysis routines
• Compare all the points that reference received data as a whole
  • Combine their fuzzy hashes, calculate LCS
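Comparing combined signatures by LCS (longest common subsequence) can be sketched as below. Normalizing by the longer signature and the function names are assumptions for this example; the talk does not state how the LCS length is turned into a score.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, other in enumerate(b, start=1):
            if ch == other:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def similarity(sig_a, sig_b):
    """LCS length relative to the longer signature (assumed normalization)."""
    longest = max(len(sig_a), len(sig_b))
    return lcs_length(sig_a, sig_b) / longest if longest else 1.0

# Per-BB fuzzy hashes would be concatenated first; here two signatures
# shown on the fuzzy hashing slide are compared directly.
print(similarity("Bb7g86hvE/", "GT7g86hvE/"))
```

For these two signatures the common subsequence is "7g86hvE/", so 8 of 10 characters match.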
Experiments
Experiments of crypto-algorithms
identification using PANDEMONIUM
• Experiment A: Obfuscated sample program
• Experiment B: Real-world malware
Experiment A
Analysis of obfuscated sample program
Algorithm Obf A Obf B
MD5
DES
RC4
AES
Blowfish
RSA
A) Insert dead/nop equivalent insns
B) Substitute with equivalent insns/Reorder insns ≒ under-tainting
Sample program: receive a packet, decrypt it (by Crypto++)
Experiment B (1/3)
Analysis of real-world malware
• Dyre sample
  • 999bc5e16312db6abff5f6c9e54c546f
  • b44634d90a9ff2ed8a9d0304c11bf612
  • dd207384b31d118745ebc83203a4b04a
  • B44634d90a9ff2ed8a9d0304c11bf612
  • 999bc5e16312db6abff5f6c9e54c546f
• Anti-analysis using PEB.NumberOfProcessors
Experiment B (2/3)
Analysis of real-world malware
• KINS (ZeusVM) sample
  • eee1bdb8d4ad98cce0031ed6ca43274a
  • 84826d5e65987c131a80b1a3aa53ce17
  • a2a7d4f75fc263648824facb0757a3c7
• Obfuscation by original code virtualizer
  • Ex. nop (0x90) is represented as 0x32, 0x26, 0xF3
• Use
Experiment B (3/3)
Analysis of real-world malware
| Malware | Detection ratio | Algorithm | Cause |
|---|---|---|---|
| Dyre | 4/5 | RSA | |
| KINS | 0/3 | RC4 | VM |
• PANDEMONIUM could avoid the anti-analysis of Dyre
• Taint tags might not have been propagated
  • The optimization might have removed a point that should have been analyzed
• LLVM is not suitable for analyzing modern code virtualizers
  • Themida, ZeusVM, ……
Consideration
• Is LLVM suitable for analyzing malware?
  • LLVM doesn't try to operate on carry flags very much
  • If the implementation is improved, more features of the algorithms might appear
• Or will the detection rate vary depending on the type of encryption algorithm?
  • Varies among implementations
  • Cannot be affirmed for now by criteria such as Feistel structure versus SPN structure
• PANDEMONIUM compared samples by concatenating the fuzzy hashes of BBs
  • It may be necessary to weight large blocks
Task
• Extract encryption keys
• Analyze unknown algorithms
  • Should we focus on the density and the data length of the input and output of a function?
• Analyze code virtualizers
  • Should we implement an optimization pass?
We need an analysis platform that can follow the evolution of malware
Summary
• Malware utilize many cryptographic algorithms
  • To conceal messages and configurations
• DBI (Dynamic Binary Instrumentation)
  • Dynamic analysis on PANDA (QEMU)
  • Translate x86 code to LLVM IR (intermediate representation) per BB (basic block: one entry, one exit)
  • Remove obfuscated code by optimization
• Fuzzy hash based pattern matching
  • Detect and avoid anti dynamic analysis code
  • Identify cryptographic algorithms from the similarity of handling received data