Identifying Web Attacks Via Data Analysis

Transcript
Page 1: Identifying Web Attacks Via Data Analysis
Page 2: Identifying Web Attacks Via Data Analysis

Mike Sconzo

@sooshie

R&D at Click Security

Focused on data analysis for security use cases

Interested in machine learning/statistical analysis

NetWitness

ERCOT

Sandia National Labs

Page 3: Identifying Web Attacks Via Data Analysis

● Introduction
● How to use basic log information to detect different attack types
  ○ Drive-by
  ○ SQL Injection
● Closing

Page 4: Identifying Web Attacks Via Data Analysis

● Python
  ○ IPython
  ○ pandas
  ○ numpy
  ○ matplotlib
  ○ scikit-learn
● Bro
● Google
● sqlmap
● JBroFuzz
● sqlparse

Page 5: Identifying Web Attacks Via Data Analysis

● Gather data
● Clean up data
● Explore data
● Select/create features (numeric only)*
● Run machine learning algorithm*
● Analyze results

*optional
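The clean-up and feature steps above can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's code: the sample strings, the `clean` and `features` helpers, and the choice of features are all assumptions made for the example. It only shows what "numeric only" features might look like for request strings.

```python
from urllib.parse import unquote_plus


def clean(raw):
    """Clean-up step: URL-decode, lowercase, and collapse whitespace."""
    return " ".join(unquote_plus(raw).lower().split())


def features(text):
    """Feature step: describe a string with numbers only (illustrative choices)."""
    length = len(text)
    digits = sum(c.isdigit() for c in text)
    punct = sum(not c.isalnum() and not c.isspace() for c in text)
    return {
        "length": length,
        "quotes": text.count("'") + text.count('"'),
        "digit_ratio": digits / length if length else 0.0,
        "punct_ratio": punct / length if length else 0.0,
    }


# Gather step: hypothetical samples; real input would come from logs.
samples = ["id=42", "id=1%27+OR+%271%27%3D%271"]
rows = [features(clean(s)) for s in samples]

# Explore step: print each sample next to its numeric description.
for sample, row in zip(samples, rows):
    print(sample, row)
```

The resulting dictionaries could be loaded into a pandas DataFrame for exploration, or passed to a learning algorithm if the optional steps are used.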

Page 7: Identifying Web Attacks Via Data Analysis

Is it possible to find clients being exploited by various exploit kits by just looking at traffic patterns?

● Gather data
● Clean up data
● Explore data
● Analyze results

Page 9: Identifying Web Attacks Via Data Analysis

● 21 GB of network traffic
● 7,600 samples
● 687,627 files
● 807,537 HTTP requests

Page 10: Identifying Web Attacks Via Data Analysis

*MHR (Team Cymru's Malware Hash Registry) will be used as our ground truth

Page 23: Identifying Web Attacks Via Data Analysis

Is it possible to use supervised learning (classification) to detect strings that are likely SQL Injection?

● Gather data
● Explore data
● Clean up data
● Transform data
● Select/create features (numeric only)
● Run machine learning algorithm
● Analyze results
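As a rough illustration of what supervised classification means here, the sketch below trains a nearest-centroid classifier on a handful of labelled strings. Everything in it is an assumption made for the example: the tiny training set, the three features, and the classifier itself, which is a deliberately simple stand-in for whatever scikit-learn model one might use in practice. It is not the method from the talk.

```python
from statistics import mean

# Illustrative keyword list; a real feature set would be chosen from the data.
KEYWORDS = ("select", "union", " or ", " and ", "--")


def features(text):
    """Numeric features only: quote count, keyword hits, punctuation ratio."""
    lowered = text.lower()
    punct = sum(not c.isalnum() and not c.isspace() for c in lowered)
    return [
        float(lowered.count("'")),
        float(sum(lowered.count(k) for k in KEYWORDS)),
        punct / len(lowered) if lowered else 0.0,
    ]


def train(labelled):
    """Compute one centroid (column-wise mean vector) per label."""
    by_label = {}
    for text, label in labelled:
        by_label.setdefault(label, []).append(features(text))
    return {
        label: [mean(col) for col in zip(*rows)]
        for label, rows in by_label.items()
    }


def predict(model, text):
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    vec = features(text)

    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))

    return min(model, key=lambda label: distance(model[label]))


# Hypothetical labelled data; real labels would come from the gathering step.
training = [
    ("id=42", "benign"),
    ("name=alice", "benign"),
    ("q=shoes&page=2", "benign"),
    ("id=1' OR '1'='1", "sqli"),
    ("id=1 UNION SELECT user, pass FROM users--", "sqli"),
    ("name=x'; DROP TABLE users--", "sqli"),
]
model = train(training)

print(predict(model, "item=7"))
print(predict(model, "id=2' OR 'a'='a"))
```

With so few samples the result says nothing about real-world accuracy; the point is only the shape of the workflow: labelled strings go in, numeric features are computed, and a model assigns a label to unseen strings.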

Page 30: Identifying Web Attacks Via Data Analysis

*Transform the data into a form that might give better insight than a signature

Page 32: Identifying Web Attacks Via Data Analysis

● Strings are great, but patterns might be better
● Extract patterns from the strings
● N-Grams!!!
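An n-gram is a run of n consecutive items taken from a sequence, so counting n-grams is one way to turn a string into patterns. The sketch below is a generic illustration using only the standard library; the `ngrams` helper name and the sample query are assumptions for the example. It works on characters or on a list of tokens.

```python
from collections import Counter


def ngrams(seq, n):
    """Return every overlapping run of n consecutive items as a tuple."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


query = "1' or '1'='1"  # hypothetical input string

# Character bigrams: patterns at the level of individual characters.
char_counts = Counter(ngrams(query, 2))
print(char_counts.most_common(3))

# Token bigrams: patterns at the level of whitespace-separated tokens.
token_counts = Counter(ngrams(query.split(), 2))
print(token_counts)
```

The counts can then serve as numeric features, with one column per n-gram seen in the training data.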

Page 45: Identifying Web Attacks Via Data Analysis

● It’s possible to make quality decisions/find interesting activity using data
● The more data you have, the more accurate your predictions can be
● Gathering (the right) data for the use case is important
● Cleaning the data takes a lot of effort, but it’s necessary
● Unfortunately none of this is a silver bullet, but it can help point you in the right direction(s)
● None of this is magic; you can do it too!

Page 46: Identifying Web Attacks Via Data Analysis

http://clicksecurity.github.io/data_hacking/