Chasing web-based malware
Transcript of Chasing web-based malware
![Page 2: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/2.jpg)
Who am I?
• Lecturer in Computer Security at the University of Birmingham, UK
• Member of the founding team of Lastline, Inc.
• Research interests:
– Malware analysis
– Vulnerability analysis
![Page 3: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/3.jpg)
WEB MALWARE
![Page 4: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/4.jpg)
Web-based malware
[Diagram labels: GET /, <iframe>, evil.js]
![Page 5: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/5.jpg)
Malicious code
![Page 6: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/6.jpg)
Exploit
![Page 7: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/7.jpg)
Social Engineering
![Page 8: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/8.jpg)
Not really LinkedIn
Social Malware
![Page 9: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/9.jpg)
Blackhat SEO
![Page 10: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/10.jpg)
Watering Hole Attacks
• Sometimes it is difficult to exploit the target of an attack directly
– Instead, compromise a site that the target is likely to visit
• Council on Foreign Relations → government officials
• Unaligned Chinese news site → Chinese dissidents
• iPhone dev web site → developers at Apple, Facebook, Twitter, etc.
• National Journal web site → political insiders in Washington
![Page 11: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/11.jpg)
CHASING WEB MALWARE
Oracles, Filters, Seeders, Anti-Evasion
![Page 12: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/12.jpg)
Oracle
• Essentially, a classification algorithm for web content
– Input: web page
– Output: classification (malicious or benign)
• In practice, it is useful to extract and provide users with evidence to support the classification
– Exploit detection
– Deobfuscation results
– Anything that helps forensics, really
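The oracle's contract can be sketched as a function from a page to a verdict plus supporting evidence. This is a minimal illustration, not how any real oracle works: `Verdict`, `toy_oracle`, and the two string checks are hypothetical stand-ins for an actual dynamic analysis.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Verdict:
    """Classification plus the evidence that supports it."""
    malicious: bool
    evidence: List[str] = field(default_factory=list)


def toy_oracle(page: str) -> Verdict:
    """Hypothetical oracle: a real one would run the page in an
    instrumented browser instead of matching strings."""
    lowered = page.lower()
    evidence = []
    if "eval(" in lowered:
        evidence.append("dynamic code execution: eval()")
    if "unescape(" in lowered:
        evidence.append("string decoding: unescape()")
    # Toy rule (an assumption): both indicators together mean malicious.
    return Verdict(malicious=len(evidence) == 2, evidence=evidence)


if __name__ == "__main__":
    print(toy_oracle("<script>eval(unescape('%61'))</script>"))
```

Returning the evidence alongside the boolean is the point of the sketch: the caller can show an analyst why a page was classified the way it was.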
![Page 13: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/13.jpg)
Oracle approaches
• Nowadays, most oracles are dynamic analysis systems
– We care about the behavior of a sample/web page/document
• Run a sample/visit a web page inside an instrumented environment and monitor its behavior
• Bypasses all obfuscation/feasibility concerns associated with static analysis
• Opens up a lot of interesting challenges related to transparency and evasion
![Page 14: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/14.jpg)
Wepawet
• Detection and Analysis of Drive-by-Download Attacks and Malicious JavaScript Code, Marco Cova, Christopher Kruegel, Giovanni Vigna, in Proceedings of the World Wide Web Conference (WWW), Raleigh, NC, April 2010
• http://wepawet.cs.ucsb.edu
• By the numbers:
– Number of unique IPs that submitted to Wepawet: 141,463
– Number of pages visited and analyzed by Wepawet: 67,424,459
– Number of pages identified as malicious: 2,239,335
![Page 15: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/15.jpg)
Wepawet Features
• Exploit preparation
– Number of bytes allocated (heap spraying)
– Number of likely shellcode strings
• Exploit attempt
– Number of instantiated plugins and ActiveX controls
– Values of attributes and parameters in method calls
– Sequences of method calls
• Redirections and cloaking
– Number and target of redirections
– Browser personality- and history-based differences
• Obfuscation
– String definitions/uses
– Number of dynamic code executions
– Length of dynamically executed code
![Page 16: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/16.jpg)
Filter
• If everything goes well, after a while we will have more samples/pages than we can analyze in depth with our oracle
• Analysis time ranges from a few seconds to a couple of minutes
– Oracle actually runs the sample
– Sometimes multiple times (anti-evasion techniques)
• Challenge: how do we scale?
![Page 17: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/17.jpg)
Static filtering
• Quick identification of drive-by-download web pages
– Each web page is deemed likely benign or likely malicious
• Basis for the classification is a set of static features
• Necessarily more imprecise than the oracle
– We only worry about not having false negatives
– Very tolerant of false positives (consequence: more work for our oracle)
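The asymmetry between false negatives and false positives can be illustrated with a threshold choice. This is a minimal sketch under stated assumptions: a static model emits a score per page, and `pick_threshold` and `flag` are hypothetical helper names, not part of any real filter.

```python
from typing import List


def pick_threshold(scores: List[float], is_malicious: List[bool]) -> float:
    """Highest threshold that still flags every known-malicious sample,
    i.e. zero false negatives on the labeled set."""
    return min(s for s, bad in zip(scores, is_malicious) if bad)


def flag(score: float, threshold: float) -> bool:
    """Pages at or above the threshold go to the oracle."""
    return score >= threshold


if __name__ == "__main__":
    scores = [0.10, 0.40, 0.35, 0.90, 0.20]
    labels = [False, False, True, True, False]
    t = pick_threshold(scores, labels)
    flagged = [flag(s, t) for s in scores]
    false_negatives = sum(bad and not f for bad, f in zip(labels, flagged))
    false_positives = sum(f and not bad for bad, f in zip(labels, flagged))
    print(t, false_negatives, false_positives)
```

In this toy data set the benign page scored 0.40 is flagged anyway. That is the accepted cost: the oracle does one extra analysis, but no malicious page is dropped.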
![Page 18: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/18.jpg)
Prophiler
• Filter for malicious web pages
• Prophiler: A Fast Filter for the Large-Scale Detection of Malicious Web Pages, Davide Canali, Marco Cova, Christopher Kruegel, Giovanni Vigna, in Proceedings of the International World Wide Web Conference (WWW), 2011
![Page 19: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/19.jpg)
Static features
• We define three classes of features (77 in total)
– HTML (19), source: web page content
– JavaScript (25), source: web page content
– URL and host-based (33), source: page URL and URLs included in the content
• One machine learning model for each feature class
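One way to combine one model per feature class is sketched below. The combination rule (flag the page if any model flags it) is an assumption chosen to match the filter's low tolerance for false negatives; the slide does not state it. The feature names and the toy lambda "models" are hypothetical.

```python
from typing import Callable, Dict

Features = Dict[str, float]
Model = Callable[[Features], bool]


def flag_page(features_by_class: Dict[str, Features],
              models: Dict[str, Model]) -> bool:
    """Assumed rule: a page is suspicious if ANY per-class model says so."""
    return any(model(features_by_class[name]) for name, model in models.items())


# Toy stand-ins for trained classifiers (hypothetical thresholds).
MODELS: Dict[str, Model] = {
    "html": lambda f: f["hidden_iframes"] > 0,
    "javascript": lambda f: f["dynamic_code_executions"] > 5,
    "url": lambda f: f["host_is_ip_address"] == 1,
}

if __name__ == "__main__":
    page = {
        "html": {"hidden_iframes": 1},
        "javascript": {"dynamic_code_executions": 0},
        "url": {"host_is_ip_address": 0},
    }
    print(flag_page(page, MODELS))
```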
![Page 20: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/20.jpg)
Example features
HTML features
• iframe tags, hidden elements, elements with a small area, script elements, embed and object tags, scripts with a wrong filename extension, out-of-place elements, included URLs, scripting content percentage, whitespace percentage, meta refresh tags, double HTML documents, …
![Page 21: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/21.jpg)
Matches
<div style="display:none"> <iframe src="http://biozavr.ru:8080/index.php" width=104 height=251 > </iframe></div>
<body><div id="DivID"> <script src='a2.jpg'></script> <script src='b.jpg'></script> <script src='url.jpg'></script> <script src='c.jpg'></script> <script src='d.jpg'></script> <script src='e.jpg'></script> <script src='f.jpg'></script>"</body>
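Two of the HTML features that fire on the snippets above (iframes inside hidden elements, scripts with a wrong filename extension) can be computed with Python's standard `html.parser`. This is an illustrative sketch, not Prophiler's extractor: the class name, the `display:none` check, and the extension whitelist are assumptions, and the second sample is shortened to two script tags.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urlparse

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # never closed


class HtmlFeatures(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open_tags = []          # stack of (tag, is_hidden)
        self.hidden_iframes = 0
        self.scripts_wrong_ext = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        style = (attrs.get("style") or "").replace(" ", "").lower()
        hidden = "display:none" in style
        inside_hidden = any(is_hidden for _, is_hidden in self.open_tags)
        if tag == "iframe" and (hidden or inside_hidden):
            self.hidden_iframes += 1
        if tag == "script" and attrs.get("src"):
            ext = os.path.splitext(urlparse(attrs["src"]).path)[1].lower()
            if ext not in ("", ".js"):
                self.scripts_wrong_ext += 1
        if tag not in VOID_TAGS:
            self.open_tags.append((tag, hidden))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, tolerating bad nesting.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i][0] == tag:
                del self.open_tags[i:]
                break


def extract(html: str) -> dict:
    parser = HtmlFeatures()
    parser.feed(html)
    parser.close()
    return {"hidden_iframes": parser.hidden_iframes,
            "scripts_wrong_ext": parser.scripts_wrong_ext}


HIDDEN_IFRAME = ('<div style="display:none"> <iframe '
                 'src="http://biozavr.ru:8080/index.php" width=104 height=251 >'
                 ' </iframe></div>')
WRONG_EXT = ("<body><div id=\"DivID\"> <script src='a2.jpg'></script> "
             "<script src='b.jpg'></script></body>")

if __name__ == "__main__":
    print(extract(HIDDEN_IFRAME), extract(WRONG_EXT))
```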
![Page 22: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/22.jpg)
Evaluation
• Large-scale evaluation of Prophiler
• 60 days of crawling + analysis
• 18,939,908 unlabeled pages
• 14.3% of pages flagged as suspicious and submitted to Wepawet (13.7% FP)
• 85.7% load reduction on Wepawet = saving more than 400 days of analysis!
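The evaluation figures can be checked with a little arithmetic. The page count and percentages come from the slide. The last step is a back-of-envelope estimate that assumes the saved time is sequential oracle time spent on the discarded pages, which the slide does not state.

```python
TOTAL_PAGES = 18_939_908      # unlabeled pages crawled (from the slide)
FLAGGED_FRACTION = 0.143      # flagged as suspicious (from the slide)
DAYS_SAVED = 400              # lower bound quoted on the slide

flagged = round(TOTAL_PAGES * FLAGGED_FRACTION)   # pages sent to the oracle
discarded = TOTAL_PAGES - flagged                 # pages the oracle never sees
load_reduction = 1 - FLAGGED_FRACTION

# Assumption: the saving equals sequential analysis time of discarded pages.
implied_seconds_per_page = DAYS_SAVED * 86_400 / discarded

if __name__ == "__main__":
    print(f"flagged: {flagged:,}  discarded: {discarded:,}")
    print(f"load reduction: {load_reduction:.1%}")
    print(f"implied analysis time: {implied_seconds_per_page:.2f} s/page")
```

Under that assumption, roughly 2.7 million pages reach the oracle, and the quoted saving corresponds to about two seconds of analysis per discarded page, the low end of the analysis times mentioned earlier.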
![Page 23: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/23.jpg)
Smart crawler
• How do we seed our oracle + filter?
• Obvious idea: crawling
– Problem: toxicity of regular crawling is pretty low
– Observation: crawling is only as good as the initial seeds
• Challenge: can we find better seeds?
![Page 24: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/24.jpg)
EvilSeed
• Guided search approach to increase toxicity of pages that are crawled
• Inputs: malicious web pages found in the past
• Output: set of (more likely malicious) web pages
• EvilSeed: A Guided Approach to Finding Malicious Web Pages, Luca Invernizzi, Stefano Benvenuti, Paolo Milani, Marco Cova, Christopher Kruegel, Giovanni Vigna, in Proceedings of the IEEE Symposium on Security and Privacy, 2012
![Page 25: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/25.jpg)
Gadgets
![Page 26: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/26.jpg)
Gadgets
• Links gadget (malware hub)
• Content dorks gadget
• SEO gadget
• Domain registration gadget
• DNS queries gadget
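A gadget takes known malicious pages as seeds and returns new candidate URLs to crawl. The sketch below shows one possible reading of the links gadget, not EvilSeed's implementation: a page that links to several known malicious pages is treated as a "malware hub", and its other outgoing links become candidates. The function name, the `outlinks` map, and the `min_hits` parameter are all hypothetical.

```python
from typing import Dict, List, Set


def links_gadget(seeds: Set[str],
                 outlinks: Dict[str, List[str]],
                 min_hits: int = 2) -> Set[str]:
    """Return unseen URLs linked from pages that point to at least
    `min_hits` known malicious seeds (assumed definition of a hub)."""
    candidates: Set[str] = set()
    for page, links in outlinks.items():
        hits = seeds.intersection(links)
        if len(hits) >= min_hits:
            candidates.update(url for url in links if url not in seeds)
    return candidates


if __name__ == "__main__":
    seeds = {"http://bad1.example", "http://bad2.example"}
    outlinks = {
        "http://hub.example": ["http://bad1.example", "http://bad2.example",
                               "http://unknown.example"],
        "http://blog.example": ["http://bad1.example", "http://news.example"],
    }
    print(links_gadget(seeds, outlinks))
```

Here only the hub's unknown link is returned. The blog links to a single seed, so it does not qualify as a hub under the assumed `min_hits` of 2.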
![Page 27: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/27.jpg)
Anti-evasion
• At this point of the story, the bad guys will actively try to evade your system
• Lots of effort in designing evasion techniques
– Analysis environment detection
– User detection
– Stalling
• Challenge: how do we detect if we are being evaded?
![Page 28: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/28.jpg)
Revolver
• Assumption: attackers are likely to take existing malicious samples/web pages and enhance them to add evasive code
• Idea: detect similar samples that are classified differently by the oracle
• Revolver: An Automated Approach to the Detection of Evasive Web-based Malware, A. Kapravelos, Y. Shoshitaishvili, M. Cova, C. Kruegel, G. Vigna, in Proceedings of the USENIX Security Symposium, Washington, D.C., August 2013
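The idea of pairing similar samples that received different verdicts can be sketched as follows. Each script is represented as a sequence of AST node types, which are assumed to be given since parsing JavaScript is out of scope here. Python's `difflib.SequenceMatcher` is swapped in as the similarity measure. The node names, sample names, and the 0.7 threshold are hypothetical, and this is not Revolver's actual algorithm.

```python
from difflib import SequenceMatcher
from itertools import product
from typing import Dict, List, Tuple

NodeSeq = List[str]


def similarity(a: NodeSeq, b: NodeSeq) -> float:
    """Similarity in [0, 1] between two AST node-type sequences."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def candidate_pairs(benign: Dict[str, NodeSeq],
                    malicious: Dict[str, NodeSeq],
                    threshold: float = 0.7) -> List[Tuple[str, str, float]]:
    """Pairs (b_i, m_j) that look alike although the oracle disagreed."""
    pairs = []
    for (b_name, b_seq), (m_name, m_seq) in product(benign.items(),
                                                    malicious.items()):
        score = similarity(b_seq, m_seq)
        if score >= threshold:
            pairs.append((b_name, m_name, score))
    return pairs


if __name__ == "__main__":
    exploit = ["VAR", "ASSIGN", "CALL", "CALL", "STRING", "CALL"]
    malicious = {"m1": exploit}
    benign = {
        # Same code wrapped in an added check: a possible evasion.
        "b1": ["IF", "VAR", "LTE", "NUM"] + exploit,
        "b2": ["FUNCTION", "RETURN", "NUM"],
    }
    print(candidate_pairs(benign, malicious))
```

A pair such as (b1, m1) is only a candidate. The added `IF` prefix might be an evasion check, but it could equally be an unrelated change, so each pair still needs further classification.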
![Page 29: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/29.jpg)
Revolver
[Figure: Revolver pipeline. Pages from the Web and the oracle are turned into ASTs (e.g. IF, VAR <= NUM, …); similarity computation yields candidate pairs {b_i, m_j}, which are classified as malicious evolution, data-dependency, JavaScript infections, or evasions.]
![Page 30: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/30.jpg)
Revolver
![Page 31: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/31.jpg)
[Figure: overall architecture. Components shown: Crawler, Public Portal, EvilSeed, Terms Extractor, Prophiler, Feature Extractor, Wepawet, Honeyclients, Anubis, Cloud, Threat Intel, Block. Page sets: Benign Pages, Possibly Malicious Pages, Malicious Pages. Site types: Exploit Site, C&C Site. Example URLs: http://www.easymoney.com, http://cheapfarma.ru, http://rateyourcar.com, http://nudecelebrities.it]
![Page 32: Chasing web-based malware](https://reader034.fdocuments.us/reader034/viewer/2022051820/553bb815550346bd418b476f/html5/thumbnails/32.jpg)
Challenges
• Evasions
– Detection
– Bypass (when possible)
• Targeted attacks
• Defense/offense imbalance