‘Supervised Automation’ for Malware Variant Generation: Theoretical and Practical Implications

Rachit Mathur, Research Scientist, McAfee. 18th EICAR Annual Conference, 9th–12th May 2009, Berlin, Germany.


Page 1

‘Supervised Automation’ for Malware Variant Generation: Theoretical and Practical Implications

Rachit Mathur, Research Scientist, McAfee

18th EICAR Annual Conference, 9th–12th May 2009, Berlin, Germany

Page 2

Agenda

Introduction & Malware Growth

Supervised-Automation

Compare With Metamorphism

Real-World Examples

Detection Challenges

Conclusions & Future work

Questions

Page 3

Malware Growth – All known samples

+180%

Page 4

Malware Growth – Families vs Variants

Page 5

Rogue AV Unique Binaries Discovered

Page 6

Sample Count Explosion

• Lots of variants per family
• New variants released even before a signature for previous ones gets released
• Money-motivated organized malware gangs
  – ‘Professional products’
  – Pose serious detection challenges
    • Difficult to anticipate changes
    • Short-term per family proactive detection is minimum requirement
  – Use bleeding-edge technology
    • Conficker – crypto algorithms
    • MBR rootkit – stealth techniques
    • To evade detection is the primary motive

Page 7

Morphing Malware

• Not the traditional poly- or metamorphics
• Do not carry the mutator
• Delivered through the cloud (server-side)
  – Drive-by downloads, social engineering, self-updating malware
  – Binaries change often
• Now adopted by all
  – Backdoor, PWS, AdClicker, Proxy, Worms etc.
• Morphing services
  – Tibs-Packed: Storm worm, downloader, uploader, spam-bot, backdoors etc.
  – FakeAV-looking downloaders, backdoors, worms
• Human-supervised automated variant generation system

Page 8

Supervised-Automation

• Supervised Automation (SA) is a semi-automated method of generating malware variants with sporadic human intervention

• Loosely related to the concept of metamorphism

• Not based on any particular malware family

Page 9

Supervised Automation

[Diagram: SA pipeline. Malicious binary & info (B) → Select and apply encryption → E(B) → Select and apply morphing → M(E(B)) → Black-box signature extraction → Release-to-world, with a loop-back to re-encrypt. A human supplies info to the stages.]

Encryption options shown:

• ADD
• SUB
• XOR
• ROT
• RC4

Morphing options shown:

• Dead Code Insertion
• Junk Code Insertion
• CFG Obfuscation
• Instruction Substitution
• Decryption Key Obfuscation
• Geometric Fuzzyfication

Page 10

Supervised-Automation

• Generate any number of new variants at the desired frequency

• Motive is to evade detection and not ‘blindly’ generate variants

• Different pattern of operation observed in Tibs-Packed, FakeAV, GamePWS trojans

Page 11

SA vs. Metamorphism

• Generally speaking, virus detection is undecidable

• Solutions for specific sub-cases have been proposed

• Let us see which existing results from comparable technology apply to SA

• Purely automatic variant generation, i.e. metamorphism, has been studied

Page 12

SA vs. Metamorphism

• Do not carry the engine
• Transformation logic is not self-contained
• Transformation rules not constant
• No feed-back loop
• Transformations not limited
• Anti-debugging, anti-disassembly, anti-emulation: anti-analysis

[Diagram: Metamorphic engine: Locate own code → Decode → Analyze → Transform]

Page 13

Normalization based approach

• Transformation rules modelled as Term Rewriting Systems (TRS) and related to formal grammars
• Proving equivalence between two programs w.r.t. a rewriting system reduces to the famous word problem
  – Undecidable in general
  – Unless TRS is confluent and terminating
  – Some approximation based approaches

[Figure: example rewrite rules over x86 code. Fragments shown: "mov edi, 0x04"; "mov eax, 0x04 / push eax"; "push eax / mov eax, 0x04"; "push 0x04"; "push ecx / mov ecx, 0x04 / mov edi, ecx / pop ecx". Rule conditions shown: "eax not live", "unconditional".]
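The normalization idea on this slide can be illustrated from the defender's side with a very small term-rewriting normalizer. This is only a sketch under stated assumptions: the pairing of left-hand and right-hand sides below is one plausible reading of the figure fragments, the names `RULES`, `rewrite_once` and `normalize` are hypothetical, and register liveness is simplified to a single set of registers assumed dead for the whole snippet (real liveness is computed per program point).

```python
# Hypothetical rule table: (left-hand side, right-hand side, register that must be dead, or None).
# The pairings are an assumed reading of the slide's figure, not the author's exact rule set.
RULES = [
    (("push ecx", "mov ecx, 0x04", "mov edi, ecx", "pop ecx"),
     ("mov edi, 0x04",), None),            # unconditional rule
    (("mov eax, 0x04", "push eax"),
     ("push 0x04",), "eax"),               # only valid if eax is not live afterwards
]


def rewrite_once(instrs, dead_regs):
    """Apply the first applicable rule at its leftmost match; return True if a rewrite happened."""
    for lhs, rhs, must_be_dead in RULES:
        if must_be_dead is not None and must_be_dead not in dead_regs:
            continue
        width = len(lhs)
        for i in range(len(instrs) - width + 1):
            if tuple(instrs[i:i + width]) == lhs:
                instrs[i:i + width] = rhs
                return True
    return False


def normalize(instrs, dead_regs=frozenset()):
    """Rewrite until no rule applies. Every rule shortens the sequence, so this terminates."""
    result = list(instrs)
    while rewrite_once(result, dead_regs):
        pass
    return result


variant = ["push ecx", "mov ecx, 0x04", "mov edi, ecx", "pop ecx",
           "mov eax, 0x04", "push eax"]
print(normalize(variant, dead_regs={"eax"}))  # ['mov edi, 0x04', 'push 0x04']
```

Two variants can then be compared by their normal forms. Termination is easy to see here because every rule shrinks the instruction list; a unique normal form additionally requires the rule set to be confluent, which is the condition named in the bullets above.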

Page 14

Normalization based approach

[Diagram: rewrite systems RS1, RS2, RS3 in use over time]

• Multiple TRS bad news for some solutions
• Q: Do multiple TRS really make a difference?
  • Same worst case for a ‘well-designed’ system
  • But multiple TRS does make things worse

Page 15

Approaches

• Approaches that are agnostic of rule systems can be useful against such systems

• Smart byte-based detection schemes

• Normalization based on general optimization techniques and program semantics based detection methods

• Behaviour based detection may be useful today

• Emulation based techniques have been proposed earlier to identify detectable behaviours but emulation has a host of well known problems

Page 16

Example – Storm worm

[Screenshot labels: start of encrypted code; end of encrypted code; fake call returns -1; add, rotate]

• Locate the start address of encrypted data and size/end of the data
• Calculate key(s): key[i]
• Apply key(s)
• Transfer control to decrypted code
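For analysis purposes, the first three decryptor steps listed above can be mirrored by a small static decoder. The sketch below is illustrative only: the slide labels the transform as "add, rotate" but does not give the exact routine, so the byte-wise "add key, then rotate left" order, the repeating key list, and the names `rol8` and `decode_region` are all assumptions.

```python
def rol8(value, count):
    """Rotate an 8-bit value left by count bits."""
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF


def decode_region(image, start, end, keys, rot):
    """Decode image[start:end] with an assumed 'add key[i], then rotate left' transform.

    start/end correspond to step 1 (locating the encrypted data), keys to step 2
    (key[i], repeated cyclically here), and the loop body to step 3 (applying the keys).
    """
    out = bytearray()
    for i, byte in enumerate(image[start:end]):
        key = keys[i % len(keys)]
        out.append(rol8((byte + key) & 0xFF, rot))
    return bytes(out)


# Toy input: decode the two middle bytes with a single key of 2 and a rotation of 1.
decoded = decode_region(b"\x00\x7f\x01\xff", start=1, end=3, keys=[2], rot=1)
print(decoded.hex())  # 0306
```

Step 4, transferring control to the decoded code, is deliberately left out: a static decoder hands the recovered bytes to a scanner or disassembler rather than executing them.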

Page 17

Example – Storm worm

[Screenshot labels: start of encrypted code; end of encrypted code; fake call returns -1; add, rotate]

Page 18

Example – Storm worm

[Screenshot labels: start of encrypted code; end of encrypted code; fake call returns -1; add, rotate]

Page 19

Example – Storm worm

[Diagram: a Base Variant (BV) is encrypted with Algorithm A, Algorithm B, Algorithm C, … Algorithm N to give EBV1, EBV2, EBV3, …; each encrypted variant is morphed into M1, M2, … Mn. A timeline (Day 1, Day 2, … Day m; Day m+1, Day m+2, … Day n; Day n+1, Day n+2, … Day o) shows the keys (K) and morphed variants (M11, M12, … M1n; M21, M22, … M2n) in use over time.]

• High, medium and low frequency changes

Page 20

Example – DNSChanger

• Uses obfuscated calls

Rules can be conditional

Possible call targets

Page 21

Example – PWS dll

• Rules change often
• Constructs strings

HBXYXND-0109-NEW

Page 22

Example – PWS dll

• Rules change often
• Constructs strings

WM_HOOKEX_RK

Page 23

Example – PWS dll

• Rules change often
• Constructs strings

Explorer.exe

Page 24

Example – PWS dll

• Rules change often
• Constructs strings

act=getpos&account=%s

Page 25

Example – FakeAV

• junk code
• variable renaming
• register liveness
• second one is reversed

Page 26

Detection Challenges

• Virus authors want to evade detection and stay undetected once a machine is compromised

• AV update should detect the ‘current’ variant – somewhat ‘proactive’

• Able to detect all automatically generated variants up until the next human-driven update

• Resistant to non-functional changes

Page 27

Signatures

• Goal is to find ‘enough’ evidence to detect and classify a file for practical purposes without generating any false positives
  – Generic
  – Reliable: no false positives
  – “my virus botnet, attack ms08-067 ping”

Page 28

Signatures

• Simple byte sequence based not useful
  – Hash based
  – Detection worthy strings
  – Detection worthy code sequence
• Multiple sets of wildcard based byte sequences at various locations that remain constant
• Emulation
• Decryption or cryptanalysis based
  – Presence of a technique can lend itself to detection
• Geometry based
• Combination provides the right balance
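The "multiple sets of wildcard based byte sequences" idea from the list above can be sketched as a small matcher. This is an illustration under assumptions, not any vendor's signature format: the `??` wildcard syntax and the names `compile_signature` and `matches_all` are hypothetical, and the example pattern (a `call rel32` followed by `cmp eax, -1`) is made up for the demo rather than taken from a real detection.

```python
import re


def compile_signature(pattern):
    """Turn 'E8 ?? ?? ?? ?? 83 F8 FF' into a bytes regex; '??' matches any single byte."""
    parts = []
    for token in pattern.split():
        if token == "??":
            parts.append(b".")
        else:
            parts.append(re.escape(bytes([int(token, 16)])))
    # DOTALL so that '.' also matches the newline byte 0x0A.
    return re.compile(b"".join(parts), re.DOTALL)


def matches_all(data, signatures):
    """A file is flagged only if every wildcard sequence occurs somewhere in it."""
    return all(compile_signature(sig).search(data) for sig in signatures)


# Hypothetical signature set: a call with any 32-bit target followed by 'cmp eax, -1',
# plus a constant 'ret' byte elsewhere in the buffer.
SIGNATURES = ["E8 ?? ?? ?? ?? 83 F8 FF", "C3"]

variant_a = b"\x90\xe8\x10\x20\x30\x40\x83\xf8\xff\xc3"
variant_b = b"\xe8\x0a\x00\x00\x00\x83\xf8\xff\x90\x90\xc3"
print(matches_all(variant_a, SIGNATURES), matches_all(variant_b, SIGNATURES))
```

Both buffers differ byte-for-byte and would have different hashes, yet they share the bytes that stay constant around the wildcarded call target. Requiring several such sequences at once is one way to trade genericity against false positives, in the spirit of the "combination" bullet.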

Page 29

Conclusions & Future Work

• Stakes are getting bigger with increasingly critical, sensitive, high-value information at risk

• Adoption of cutting-edge research concepts and innovation skills by virus authors

• More automation and more understanding of ‘correct’ transformation techniques is expected

• Interesting to formalize some results in the realm of SA based malware

• Detection solutions that are agnostic of rewrite systems need to be investigated

• It will also be interesting to see how behaviour evolution materializes in reality and any forward looking research around that is very relevant


Page 31

Thank You! (Danke!)

Suggestions & Questions: Email: [email protected]
