Ariu - Workshop on Multiple Classifier System - 2011



A modular architecture for the analysis of HTTP payloads based on Multiple Classifiers

Davide Ariu

[email protected]

Giorgio Giacinto

[email protected]

Department of Electrical and Electronic Engineering, University of Cagliari

Pattern Recognition and Applications Group

http://prag.diee.unica.it

This research was sponsored by the Autonomous Region of Sardinia through a grant financed with the "Sardinia PO FSE 2007-2013" funds and provided according to the L.R. 7/2007.

Naples, 17 June 2011


Outline

•  Motivations

•  The proposed system

•  Experimental Setup and Results

•  Conclusions


The objective

Design of an anomaly-based Intrusion Detection System for the protection of Web Servers and Applications.

The HTTP traffic toward the web servers is inspected by a multiple classifier system.


 Why Web Applications? 


 Why Anomaly Detection?


 A legitimate Payload...

GET /pra/ita/home.php HTTP/1.1

Host: prag.diee.unica.it

 Accept: text/*, text/html

User-Agent: Mozilla/4.0


Request Line: GET /pra/ita/home.php HTTP/1.1

Request Headers: the Host, Accept and User-Agent lines
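The split between Request Line and Request Headers can be sketched in a few lines of Python. This is only an illustration, not the authors' parser: the function name `split_payload` and the choice to lower-case header names are assumptions made here.

```python
def split_payload(payload: str):
    """Split a raw HTTP request into its request line and a dict of headers."""
    # Accept both CRLF line endings and bare LF, as captured traffic may vary.
    lines = payload.replace("\r\n", "\n").split("\n")
    request_line = lines[0].strip()
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header section
        name, sep, value = line.partition(":")
        if sep:
            # Header names are case-insensitive, so normalize them (assumption).
            headers[name.strip().lower()] = value.strip()
    return request_line, headers


# The legitimate payload shown on the slide.
payload = (
    "GET /pra/ita/home.php HTTP/1.1\r\n"
    "Host: prag.diee.unica.it\r\n"
    "Accept: text/*, text/html\r\n"
    "User-Agent: Mozilla/4.0\r\n"
    "\r\n"
)
request_line, headers = split_payload(payload)
```

Each entry of `headers`, together with `request_line`, is one "line" of the payload in the sense used in the rest of the talk.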


...and some attacks

•  Long Request Buffer Overflow

HEAD / aaaaaaa…aaaaaaaaaaaa

•  URL Decoding Error

GET /d/winnt/sys32/cmd.exe?/c+dir HTTP/1.0

Host: www

Connection: close


Why Payload Analysis?

•  Detection of Web-based attacks based on the

 –  Analysis of the Request-Line

    •  Allows detecting only attacks that exploit input-validation flaws, e.g. Spectrogram ([Song,2009]), HMM-Web ([Corona,2009])

 –  HTTP Payload Analysis

    •  Takes into account the whole HTTP request, and thus it can (in principle) detect any kind of attack


State of the Art (SOA) - Payload Analysis

•  PAYL [Wang,2004]

 –  n-grams to represent byte statistics

•  McPAD [Perdisci,2009]

 –  Ensemble of one-class SVMs trained on ν-grams

•  Spectrogram [Song,2009]

 –  Ensemble of Markov Chains to analyze the Request-Line

•  HMMPayl [Ariu,2011]

 –  Ensemble of HMMs to analyze sequences of bytes from the whole payload

None of the above techniques represents the structure of the payload.


The proposed system

Basic Idea

•  We propose to take into account the structure of HTTP payloads

 –  For each line of the payload, an ensemble of HMMs is used to model the sequences of bytes.

 –  The final decision is obtained by using the HMM outputs as features. The payload is thus classified by a one-class classifier trained on the outputs of the HMM ensembles.
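To make the per-line scoring step concrete, here is a minimal pure-Python sketch of how one payload line could be scored by an ensemble of discrete HMMs using the scaled forward algorithm. Several things are assumptions of this sketch and are not stated in the slides: the HMM parameters are random placeholders (a real system would train them, e.g. with Baum-Welch, on legitimate traffic), the ensemble output is the plain average of the per-byte log-likelihoods, and the helper names `random_hmm`, `log_likelihood` and `ensemble_score` are hypothetical.

```python
import math
import random


def random_hmm(n_states, n_symbols, rng):
    """Discrete HMM with random, row-normalized parameters (stand-in for training)."""
    def row(n):
        weights = [rng.random() + 1e-3 for _ in range(n)]
        total = sum(weights)
        return [w / total for w in weights]

    start = row(n_states)
    trans = [row(n_states) for _ in range(n_states)]
    emit = [row(n_symbols) for _ in range(n_states)]
    return start, trans, emit


def log_likelihood(hmm, seq):
    """Scaled forward algorithm: log P(seq | hmm) for a non-empty symbol sequence."""
    start, trans, emit = hmm
    n = len(start)
    alpha = [start[i] * emit[i][seq[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(seq)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][seq[t]]
                for j in range(n)
            ]
        scale = sum(alpha)          # P(symbol t | symbols before t)
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def ensemble_score(hmms, line: bytes):
    """Average per-byte log-likelihood over the ensemble (averaging is an assumption)."""
    seq = list(line)
    return sum(log_likelihood(h, seq) for h in hmms) / (len(hmms) * len(seq))


rng = random.Random(0)
hmms = [random_hmm(n_states=3, n_symbols=256, rng=rng) for _ in range(5)]
score = ensemble_score(hmms, b"Host: prag.diee.unica.it")
```

One such score per payload line gives the feature vector that the one-class classifier works on.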


The proposed system

A scheme

[Figure: scheme of the IDS. The HTTP payload (in the example: "GET /pra/index.php HTTP/1.1", "Host: prag.diee.unica.it", "User-Agent: Mozilla/5.0", "Accept-Encoding: gzip, deflate") is split into lines. Each line type (Request-Line, User-Agent, Host, Accept-Encoding, Accept-Language) is analyzed by its own HMM ensemble. The ensemble outputs (0.62, -1, 0.53, 0.34 and 0.49 in the example) are fed to a one-class classifier, which produces the output score or class label.]
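The last stage of the scheme, a one-class classifier fed with the HMM ensemble outputs, can be illustrated with a very small Gaussian model. This is a sketch under stated assumptions, not the classifier used in the experiments: it uses a diagonal covariance, a variance floor of 1e-6, a rejection threshold set from a quantile of the training scores, and made-up two-dimensional feature values. The class name `DiagonalGaussianOneClass` is hypothetical.

```python
import math


class DiagonalGaussianOneClass:
    """One-class model: fit a diagonal Gaussian on legitimate feature vectors."""

    def fit(self, rows, target_fp_rate=0.05):
        n, d = len(rows), len(rows[0])
        self.mean = [sum(r[j] for r in rows) / n for j in range(d)]
        # Per-feature variance with a small floor to avoid division by zero.
        self.var = [
            max(sum((r[j] - self.mean[j]) ** 2 for r in rows) / n, 1e-6)
            for j in range(d)
        ]
        # Threshold chosen so that roughly target_fp_rate of training rows are rejected.
        scores = sorted(self.score(r) for r in rows)
        idx = min(n - 1, int(math.ceil((1.0 - target_fp_rate) * n)) - 1)
        self.threshold = scores[idx]
        return self

    def score(self, row):
        """Anomaly score: negative log-density, higher means more anomalous."""
        return sum(
            0.5 * math.log(2 * math.pi * v) + (x - m) ** 2 / (2 * v)
            for x, m, v in zip(row, self.mean, self.var)
        )

    def is_anomalous(self, row):
        return self.score(row) > self.threshold


# Made-up HMM ensemble outputs for five legitimate requests (two features each).
train = [[0.60, 0.50], [0.62, 0.53], [0.58, 0.49], [0.61, 0.52], [0.59, 0.51]]
model = DiagonalGaussianOneClass().fit(train)
normal_flag = model.is_anomalous([0.60, 0.51])
attack_flag = model.is_anomalous([-1.0, -1.0])
```

The model only ever sees legitimate traffic during training, which is what makes it a one-class (anomaly detection) approach.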


Missing Features

•  Each request typically does not contain all the headers

 –  Training phase: the value of the feature related to a missing header has been set to the average value

 –  Testing phase: the value of the feature related to a missing header has been set to -1
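The two rules above can be sketched as follows. The slide does not say how the "average value" is computed; this sketch assumes it is the mean of that feature over the training requests in which the header is present, and falls back to 0.0 for a header never seen in training (also an assumption). The function names and the feature values are hypothetical.

```python
MISSING = None  # marker for "this header was absent from the request"


def training_means(train_rows):
    """Per-feature mean over the training rows in which the feature is present."""
    n_features = len(train_rows[0])
    means = []
    for j in range(n_features):
        present = [row[j] for row in train_rows if row[j] is not MISSING]
        # Fallback of 0.0 for a header never observed in training (assumption).
        means.append(sum(present) / len(present) if present else 0.0)
    return means


def fill_training(row, means):
    """Training phase: a missing feature takes the average value."""
    return [means[j] if v is MISSING else v for j, v in enumerate(row)]


def fill_testing(row):
    """Testing phase: a missing feature is set to -1."""
    return [-1.0 if v is MISSING else v for v in row]


# Made-up HMM ensemble outputs; columns could be Request-Line, Host, User-Agent.
train = [[0.6, MISSING, 0.5], [0.4, 0.3, MISSING]]
means = training_means(train)
filled_train = [fill_training(row, means) for row in train]
filled_test = fill_testing([0.62, MISSING, 0.53])
```

With the average used in training, a missing header does not distort the model of legitimate traffic, while the -1 used in testing keeps the feature vector at a fixed length.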


Experimental Setup - 1

•  2 datasets of real legitimate traffic

 –  DIEE, collected at the University of Cagliari

 –  GT, collected at Georgia Tech


Experimental Setup - 2

•  3 datasets of real attacks

 –  Generic, 66 attacks

 –  Shell-code, 11 attacks

 –  XSS-SQL Injection, 38 attacks

•  Training: 1 day of traffic

•  Test: the remaining traffic plus attacks

 –  K-fold CV


Experimental Setup - 3

•  4 one-class classification algorithms with default setting of parameters

 –  Gauss – Gaussian distribution

 –  MoG – Mixture of Gaussians

 –  Parzen – Parzen density estimator

 –  SVM – SVM with RBF kernel

•  Performance evaluated using the Partial AUC

 –  Computed in the FP range [0, 0.1]

 –  Normalized dividing by 0.1
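The normalized Partial AUC described above is the area under the ROC curve for false positive rates between 0 and 0.1, divided by 0.1, so a perfect detector scores 1. A minimal sketch of the computation follows; it assumes that a higher score means "more anomalous", builds the ROC by sweeping a threshold over all observed scores, and integrates with the trapezoidal rule, none of which is specified in the slides. The function names are hypothetical.

```python
def roc_points(neg_scores, pos_scores):
    """ROC points (FPR, TPR) obtained by sweeping a threshold over all scores."""
    thresholds = sorted(set(neg_scores) | set(pos_scores), reverse=True)
    points = [(0.0, 0.0)]
    for th in thresholds:
        fpr = sum(s >= th for s in neg_scores) / len(neg_scores)
        tpr = sum(s >= th for s in pos_scores) / len(pos_scores)
        points.append((fpr, tpr))
    return points


def partial_auc(neg_scores, pos_scores, max_fpr=0.1):
    """Area under the ROC for FPR in [0, max_fpr], divided by max_fpr."""
    points = roc_points(neg_scores, pos_scores)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Linearly interpolate the ROC at the right edge of the FPR range.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoidal rule
    return area / max_fpr


# Made-up anomaly scores: legitimate requests (negatives) and attacks (positives).
legit = [0.1, 0.2, 0.3, 0.4]
attacks = [0.9, 0.8]
pauc_perfect = partial_auc(legit, attacks)
pauc_inverted = partial_auc(attacks, legit)
```

Restricting the range to low false positive rates reflects the operating region that matters for an IDS, where even a small fraction of false alarms on legitimate traffic is costly.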


Experimental Results

Partial AUC – DIEE Dataset 


Experimental Results

Multiple HMM – DIEE Dataset – Shellcode Attacks


Experimental Results

Partial AUC – GT Dataset 


Experimental Results

Comparison with similar IDS 


Computational Cost


Conclusions

•  We proposed an anomaly-based IDS for the protection of Web Servers and Web Applications

•  We exploited the MCS paradigm

 –  To analyze the structure of the HTTP payload

 –  By combining the outputs through a one-class classifier

•  Compared to similar systems, our proposal

 –  Provides high performance in attack detection

 –  Is fast


Thank You!