Towards Proactive SPAM Filtering
DIMVA 2009
UNIVERSITÄT MANNHEIM
Laboratory for Dependable Distributed Systems
Jan Göbel • Pi1 - Laboratory for Dependable Distributed Systems • UNIVERSITÄT MANNHEIM
Overview
• Motivation
• Sandnet Setup
• Template Creation
• Preliminary Results
• Summary & Future Work
Motivation
• SPAM is unwanted
• Why use templates for filtering?
• Are templates more precise than current methods (Bayes filters, reputation-based filtering, ...)?
• Templates sent to bots are encrypted
• Retrieving the template from the memory of a running bot: too complex?
Example Template 1
In this example the body is fixed
Example Template 2
Xarvester Botnet
Example: the command {file "body.html", quoted printable} tells the bot to substitute the contents of the body.html file
Source: www.marshal8e6.com
Sandnet Setup
Running Spam Bots
Sandnet 1
Sandnet 2
• Spam emails are collected at the gateway (mbox)
• Malicious traffic is filtered and rate-limited
• How should test emails sent by bots be handled?
• Currently blocked
• Our current setup runs the bots only for a limited time
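Since the gateway stores the collected spam in mbox format, the messages can be read back with standard tooling. The following is a minimal sketch using Python's standard `mailbox` module; the function name `load_messages` and the choice to return (subject, body) pairs are assumptions made for illustration, not part of the original setup.

```python
import mailbox


def load_messages(mbox_path):
    """Return a list of (subject, body) pairs from an mbox file.

    Hypothetical helper: only single-part messages yield a body here,
    multipart messages are returned with an empty body.
    """
    box = mailbox.mbox(mbox_path)
    try:
        messages = []
        for msg in box:
            # get_payload(decode=True) returns bytes, or None for multipart
            payload = msg.get_payload(decode=True)
            body = payload.decode("utf-8", errors="replace") if payload else ""
            messages.append((msg.get("Subject", ""), body))
        return messages
    finally:
        box.close()
```

The resulting list could then serve as input for an offline template generation step.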
Generating Templates
The Algorithm
Template Creation 1
• The template creation algorithm:
• Take the first email as the starting template
• Sort the emails according to their length
• Take the next email as the comparison template
• Extract the common substrings
• Add emails to the template as long as the threshold is not exceeded
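The steps above can be sketched roughly as follows. This is not the authors' implementation: it assumes word-level tokens, uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for the common substring extraction, sorts the emails before picking the starting template, and interprets the threshold as a maximum fraction of an email that may become variable. The names `merge`, `build_template`, `GAP` and `max_variable` are hypothetical.

```python
import difflib

GAP = None  # marks a variable region in a template (assumed representation)


def merge(template, tokens):
    """Keep the tokens shared by template and email; the rest becomes a gap."""
    matcher = difflib.SequenceMatcher(None, template, tokens, autojunk=False)
    merged = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            merged.extend(template[i1:i2])
        elif not merged or merged[-1] is not GAP:
            merged.append(GAP)  # collapse adjacent differences into one gap
    return merged


def build_template(emails, max_variable=0.5):
    """Build one template from a non-empty list of email strings.

    Returns the template (list of tokens and gaps) and the number of
    emails that were merged into it.
    """
    ordered = sorted(emails, key=len)
    template = ordered[0].split()
    used = 1
    for email in ordered[1:]:
        tokens = email.split()
        candidate = merge(template, tokens)
        fixed = sum(1 for t in candidate if t is not GAP)
        variable = 1 - fixed / max(1, len(tokens))
        if variable > max_variable:
            continue  # threshold exceeded: likely a different campaign
        template = candidate
        used += 1
    return template, used
```

A real implementation would probably work on characters or header fields rather than whitespace-separated words, and would turn each gap into a regular expression describing what was observed there.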
Template Creation 2
Example Template 1
Only the X-Mailer header changes
Generated from 1175 emails
Example Template 2
Only the Subject and X-Mailer headers change
Generated from 4741 emails
Example Template 3
Generated from 172 emails
More complex due to word mutations in the emails
Preliminary Results
Euro Dice Casino Case Study
Euro Dice Casino 1
• We generated a template from 71 emails, all collected during a single day in October 2008
Euro Dice Casino 2
• We collected SPAM emails advertising the casino from June 2008 to April 2009
• A total of 493 emails advertising the Euro Dice Casino were collected at our spamtraps (some free email accounts)
• Checking against our previously generated template revealed a detection rate of only 5.3%
• All matches are emails received at the spamtraps during October 2008
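For scale, 5.3% of 493 emails corresponds to about 26 matching messages. A detection rate of this kind can be computed as the fraction of a corpus that a template matches. The sketch below assumes the template has already been turned into a regular expression; the function name `detection_rate` and the toy corpus are made up for illustration and are not data from the study.

```python
import re


def detection_rate(pattern, emails):
    """Fraction of emails in which the template pattern is found."""
    regex = re.compile(pattern, re.DOTALL)
    hits = sum(1 for email in emails if regex.search(email))
    return hits / len(emails) if emails else 0.0


# Toy corpus (invented): only the first message matches the literal template.
corpus = [
    "Win big at eurocasinokg.com today",
    "Win big at eurocasinoab.com today",
    "Play now at eurocasinoxy.com",
    "Unrelated newsletter",
]
rate = detection_rate(r"Win big at eurocasinokg\.com", corpus)
```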
Euro Dice Casino 3
• We added a randomly chosen email from the spamtrap emails to our template generation process
Euro Dice Casino 4
• Adding a single slightly different email resulted in a detection rate of 26% (previously 5.3%)
• We now match emails of this campaign ranging from September to November 2008
• All that changed in the template is the URL:
• Before: eurocasinokg.com
• After: eurocasino([A-Za-z]){2,2}.com
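The effect of this generalisation can be checked directly with Python's `re` module. The sketch below uses the pattern exactly as shown on the slide; the test URLs other than eurocasinokg.com are invented for illustration. Note that `{2,2}` is equivalent to `{2}`, and that the unescaped `.` in the slide's pattern matches any character, not only a literal dot.

```python
import re

# Literal URL from the first template versus the generalised pattern.
literal = re.compile(re.escape("eurocasinokg.com"))
generalised = re.compile(r"eurocasino([A-Za-z]){2,2}.com")  # as on the slide

# Only the first URL comes from the slides; the others are made up.
urls = ["eurocasinokg.com", "eurocasinoab.com", "eurocasinoabc.com"]

literal_hits = [u for u in urls if literal.fullmatch(u)]
general_hits = [u for u in urls if generalised.fullmatch(u)]

# The unescaped dot also accepts other characters in that position.
dot_is_wildcard = generalised.fullmatch("eurocasinokgXcom") is not None
```

Escaping the dot (`\.`) would make the pattern stricter without losing any of the two-letter variants.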
Euro Dice Casino 5
• Adding another email:
Euro Dice Casino 6
• Adding another email raises the detection rate to 99%
• Again, only the URL changes in the template:
• Before: eurocasino([A-Za-z]){2,2}.com
• After: ([\.A-Za-z]){0,16}
• The number of distinct emails of a campaign determines the quality of a template
• In this case a total of 3 emails suffices for a 99% detection rate of the email campaign
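One way such gap expressions could be derived is to summarise all strings observed in a variable region by their character classes and their minimum and maximum length. The sketch below is an assumption about how this might work, not the authors' code; the function name `gap_pattern` is hypothetical. As an aside, 16 is exactly the length of eurocasinokg.com, which would be consistent with the whole domain having become variable, although the slides do not state this.

```python
import re


def gap_pattern(samples):
    """Summarise the strings seen in one variable region as ([class]){min,max}.

    `samples` must be a non-empty list of strings. Returns an empty
    pattern if every sample is the empty string.
    """
    seen = set("".join(samples))
    if not seen:
        return ""
    # Characters that are not ASCII letters or digits are listed explicitly.
    others = sorted(c for c in seen if not (c.isascii() and c.isalnum()))
    parts = [re.escape(c) for c in others]
    if any(c.isascii() and c.isalpha() for c in seen):
        parts.append("A-Za-z")
    if any(c.isascii() and c.isdigit() for c in seen):
        parts.append("0-9")
    lengths = [len(s) for s in samples]
    return "([%s]){%d,%d}" % ("".join(parts), min(lengths), max(lengths))
```

With two-letter samples such as "kg" and "ab" this yields the first expression from the slides, and with a full domain plus an empty string it yields the second.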
Summary
...and future work
Summary
• Sandnet (run bots periodically)
• Offline template generation
• Common Substring Algorithm
• First results are promising
Future Work
• Rebuild the Sandnet to run bots endlessly
• Construct templates while collecting the SPAM from the running bots (realtime)
• Build a Mail-Client Plugin for template filtering
• Evaluate the approach
UNIVERSITÄT MANNHEIM
Jan Göbel
http://pi1.informatik.uni-mannheim.de/
[email protected]
Pi1 - Laboratory for Dependable Distributed Systems
Questions?