Identifying Web Attacks Via Data Analysis

Post on 08-Jun-2015

description

This presentation will look at detection of SQL injection using Machine Learning as well as profiling web traffic to find misbehaving hosts. The goal is to get beyond "Top N" types of analysis and begin using multiple features to guide us towards interesting traffic. With these techniques multiple log types can be used, everything from web server logs to proxy logs.

Transcript of Identifying Web Attacks Via Data Analysis

Mike Sconzo

@sooshie

R&D at Click Security

Focused on data analysis for security use cases

Interested in machine learning/statistical analysis

NetWitness

ERCOT

Sandia National Labs

● Introduction
● How to use basic log information to detect different attack types
  ○ Drive-by
  ○ SQL Injection
● Closing

● Python
  ○ IPython
  ○ pandas
  ○ numpy
  ○ matplotlib
  ○ scikit-learn

● Bro
● Google
● sqlmap
● JBroFuzz
● sqlparse

● Gather data
● Clean up data
● Explore data
● Select/create features (numeric only)*
● Run machine learning algorithm*
● Analyze results

*optional
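
As an illustration of the "Select/create features (numeric only)" step, here is a minimal sketch that turns a raw log string into a few numbers. The feature choices (length, digit ratio, punctuation ratio, character entropy) and the function names are assumptions made for this example; the slides do not say which features were used.

```python
import math
from collections import Counter


def shannon_entropy(s):
    """Character-level Shannon entropy of a string, in bits."""
    if not s:
        return 0.0
    total = len(s)
    counts = Counter(s)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def numeric_features(s):
    """Map a string to a dict of numeric features (hypothetical feature set)."""
    n = len(s)
    digits = sum(ch.isdigit() for ch in s)
    # Anything that is neither alphanumeric nor whitespace counts as punctuation.
    punct = sum((not ch.isalnum()) and (not ch.isspace()) for ch in s)
    return {
        "length": n,
        "digit_ratio": digits / n if n else 0.0,
        "punct_ratio": punct / n if n else 0.0,
        "entropy": shannon_entropy(s),
    }


if __name__ == "__main__":
    for value in ["id=42", "1' OR '1'='1"]:
        print(value, numeric_features(value))
```

Dictionaries like these can be loaded into a pandas DataFrame, one row per string, before any learning algorithm is run.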

Is it possible to find clients being exploited by various exploit kits by just looking at traffic patterns?

● Gather data
● Clean up data
● Explore data
● Analyze results

● 21 GB of Network Traffic
● 7,600 Samples
● 687,627 Files
● 807,537 HTTP Requests

*MHR (Team Cymru's Malware Hash Registry) will be used as our ground truth
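
To make "looking at traffic patterns" concrete, here is a minimal sketch that profiles each client across several features at once instead of a single "Top N" count. The record keys (`client`, `host`, `mime_type`) and the three aggregates are hypothetical; the slides do not list the fields or features used in the talk.

```python
from collections import defaultdict


def profile_clients(requests):
    """Aggregate per-client features from HTTP request records.

    `requests` is an iterable of dicts with the hypothetical keys
    'client', 'host' and 'mime_type'.
    """
    stats = defaultdict(
        lambda: {"requests": 0, "hosts": set(), "mime_types": set()}
    )
    for record in requests:
        entry = stats[record["client"]]
        entry["requests"] += 1
        entry["hosts"].add(record["host"])
        entry["mime_types"].add(record["mime_type"])

    # Collapse the sets into counts so every feature is numeric.
    return {
        client: {
            "requests": entry["requests"],
            "distinct_hosts": len(entry["hosts"]),
            "distinct_mime_types": len(entry["mime_types"]),
        }
        for client, entry in stats.items()
    }


if __name__ == "__main__":
    sample = [
        {"client": "10.0.0.5", "host": "a.example", "mime_type": "text/html"},
        {"client": "10.0.0.5", "host": "b.example", "mime_type": "application/pdf"},
        {"client": "10.0.0.9", "host": "a.example", "mime_type": "text/html"},
    ]
    print(profile_clients(sample))
```

Clients whose profile differs sharply from the rest are candidates for a closer look; they are not confirmed infections.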

Is it possible to use supervised learning (classification) to detect strings that are likely SQL Injection?

● Gather data
● Explore data
● Clean up data
● Transform data*
● Select/create features (numeric only)
● Run machine learning algorithm
● Analyze results

*Transform the data into a form that might give better insight than a signature

● Strings are great, but patterns might be better
● Extract patterns from the strings
● N-Grams!!!
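
One way to extract patterns rather than literal strings can be sketched as follows: reduce each token to a coarse class, then count n-grams over the class sequence. The regex tokenizer, the class names, and the keyword list are assumptions made so the example runs with the standard library alone; the tool list mentions sqlparse, which could fill the tokenizing role instead.

```python
import re
from collections import Counter

# Hypothetical coarse token classes, tried in order at each position.
TOKEN_RE = re.compile(r"""
    (?P<NUM>\d+)
  | (?P<STR>'[^']*')
  | (?P<WORD>[A-Za-z_]\w*)
  | (?P<OP>[=<>!]+)
  | (?P<PUNCT>[^\s\w])
""", re.VERBOSE)

# Small illustrative keyword list, not a complete SQL grammar.
KEYWORDS = {"select", "union", "from", "where", "or", "and", "insert", "drop"}


def token_classes(text):
    """Replace each token with its class so literal values do not matter."""
    classes = []
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if kind == "WORD" and match.group().lower() in KEYWORDS:
            kind = "KEYWORD"
        classes.append(kind)
    return classes


def ngrams(seq, n):
    """All contiguous runs of n items from seq, as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def ngram_counts(text, n=3):
    """Count class n-grams; these counts can serve as numeric features."""
    return Counter(ngrams(token_classes(text), n))


if __name__ == "__main__":
    print(token_classes("1 OR 1=1"))
    print(ngram_counts("1 OR 1=1"))
```

Because `1 OR 1=1` and `7 or 7=7` reduce to the same class sequence, a classifier trained on these counts can match the shape of an injection rather than one exact signature.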

● It’s possible to make quality decisions/find interesting activity using data
● The more data you have, the more accurate your predictions can be
● Gathering (the right) data for the use case is important
● Cleaning the data takes a lot of effort, but it’s necessary
● Unfortunately, none of this is a silver bullet, but it can help point you in the right direction(s)
● None of this is magic; you can do it too!

http://clicksecurity.github.io/data_hacking/