Platforms and Techniques for Online Traffic Identification

Cost-TMA Samos Meeting, 22-23 September 2008

Alberto Dainotti
[email protected]
COMICS (COMputer for Interaction and CommunicationS) Research Group – DIS, University of Napoli “Federico II”


People@COMICS

• COMICS (COMputers for Interaction and CommunicationS)
• Around 30 people in the group
• 2 laboratories:
  • UoN/DIS
    • @ University of Napoli
  • CINI/ITEM
    • a research lab of the Italian University Consortium in Computer Science & Engineering
• Funding mainly from EU and industry, with some money from national and local government

Projects@COMICS

• EU Projects
  • OneLab
  • OneLab2
  • NetQoS
  • Content
  • Cost TMA
  • Intersection
• National
  • Recipe (Robust and Efficient traffic Classification in IP nEtworks)

Research@Comics

• Research areas:
  • Traffic Measurements and Analysis
  • Network Monitoring
  • QoS in heterogeneous networks
  • Traffic Engineering
  • Wireless Mesh Networks
  • Management and control of network infrastructures
    • SLA, SLS, Policy based management
  • Security, Reliability and Resilience
  • Multimedia services engineering
  • …

Network Monitoring and Measurements

Topologies
• Topology Discovery
  • Intradomain, IP level, Active/Passive, Distributed, …

Links
• Active and passive measurements
  • Available Bandwidth
  • QoS parameters
  • …

Applications / Traffic
• Traffic Characterization (novel applications and malware traffic), Traffic Modeling, Traffic Generation, Traffic Classification
• Anomaly and Worm detection
• New trends: Network neutrality, Network forensics

http://www.grid.unina.it/Traffic

TIE: Traffic Identification Engine

• TIE is a community-oriented project for traffic classification
• Public web site and first beta announced today @ Cost-TMA meeting!
  http://tie.comics.unina.it
• Project started in 2007. Collaborations too!

TIE: Traffic Identification Engine

• An open-source software platform working as a multiple classifier system
• Purpose: to allow the community to work with shared tools and data to investigate several aspects of traffic classification
• Working modes: offline, online, historical web reports
• Easy to add: classification techniques, classification features, combination strategies
• Programming API and documentation. Users mailing list.
• Anonymized traces with ground-truth (GT) data
• Code to the data

TIE’s Components

[Figure: TIE architecture block diagram with the blocks Packet Filter, Session Builder, Feature Extractor, Decision Combiner, Classification Plugin #1 … Classification Plugin #n, and Output]

• Well defined portions of code allow easy modifications and extensions
• Processing revolves around a sessions table. Each session structure in the table contains
  • Status Information
  • Flags
  • Counters
  • Features
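A minimal sketch of what one entry of such a sessions table might look like in C, grouping the four kinds of data listed above. Every name here (`session_entry`, `SESS_CLASSIFIED`, `MAX_PKT_FEATURES`, the individual fields) is hypothetical and chosen for illustration; the slides do not show TIE's actual definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bits; the real flags are not listed on the slide. */
#define SESS_EXPIRED     (1u << 0)
#define SESS_CLASSIFIED  (1u << 1)

#define MAX_PKT_FEATURES 8   /* assumed cap on per-packet features kept */

/* One entry of the sessions table: status, flags, counters, features. */
typedef struct {
    /* Status information */
    double   ts_first;   /* timestamp of the first packet, seconds */
    double   ts_last;    /* timestamp of the latest packet, seconds */
    int      app_id;     /* assigned class, -1 while still unknown */
    /* Flags */
    uint32_t flags;
    /* Counters */
    uint64_t pkts;
    uint64_t bytes;
    /* Features */
    uint16_t pkt_size[MAX_PKT_FEATURES];   /* sizes of the first packets */
} session_entry;

/* Reset an entry when the first packet of a new session arrives. */
void session_init(session_entry *s, double now)
{
    assert(s != NULL);
    memset(s, 0, sizeof *s);
    s->ts_first = s->ts_last = now;
    s->app_id = -1;
}

/* Account for one more packet of `len` bytes seen at time `now`. */
void session_update(session_entry *s, double now, uint16_t len)
{
    assert(s != NULL);
    if (s->pkts < MAX_PKT_FEATURES)
        s->pkt_size[s->pkts] = len;
    s->pkts++;
    s->bytes += len;
    s->ts_last = now;
}
```

Keeping counters and features in the same structure means every later stage can work from a single pointer to the session.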

TIE’s Components: Packet Filter

• Based on the pcap* library
• Input can be either live traffic or a traffic trace
• Can perform packet filtering and validation (e.g. checksums)

*http://www.tcpdump.org
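As an illustration of the validation step, the sketch below checks an IPv4 header checksum using the standard one's-complement sum (RFC 1071). It is self-contained and does not call pcap; in a pcap-based reader the buffer would be the captured bytes after the link-layer header. The function names are assumptions made for this example, not TIE functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement checksum over 16-bit big-endian words (RFC 1071). */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (i < len)                      /* odd trailing byte, zero padded */
        sum += (uint32_t)(data[i] << 8);
    while (sum >> 16)                 /* fold the carries back in */
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Returns 1 if buf starts with a well-formed IPv4 header whose checksum
 * verifies, 0 otherwise. caplen is the number of captured bytes. */
int ipv4_header_ok(const uint8_t *buf, size_t caplen)
{
    assert(buf != NULL);
    if (caplen < 20 || (buf[0] >> 4) != 4)
        return 0;
    size_t ihl = (size_t)(buf[0] & 0x0F) * 4;   /* header length, bytes */
    if (ihl < 20 || ihl > caplen)
        return 0;
    /* A valid header, summed together with its own checksum field,
     * gives 0xFFFF, so the complemented result is 0. */
    return inet_checksum(buf, ihl) == 0;
}
```

Packets that fail such a check can be dropped before they reach the session builder, so corrupted headers never create spurious sessions.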

TIE’s Components: Session Builder

• Different definitions of sessions are allowed
  • Flows
    • <L4Proto, IPsrc, Portsrc, IPdst, Portdst> + timeout
  • Biflows
    • Same as above but src and dst swappable
    • Some heuristics for TCP can be used
  • Host
    • Under development
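The flow and biflow definitions above can be sketched in C as follows. The struct layout, the function names, and the choice of ordering the endpoints to build a direction-independent key are assumptions for illustration; the slides do not say how TIE implements the lookup, and the TCP heuristics are not shown.

```c
#include <assert.h>
#include <stdint.h>

/* 5-tuple identifying a flow; addresses and ports in host byte order. */
typedef struct {
    uint8_t  l4proto;
    uint32_t ip_src, ip_dst;
    uint16_t port_src, port_dst;
} flow_key;

/* Flow equality: direction matters. */
int flow_equal(const flow_key *a, const flow_key *b)
{
    return a->l4proto == b->l4proto &&
           a->ip_src == b->ip_src && a->port_src == b->port_src &&
           a->ip_dst == b->ip_dst && a->port_dst == b->port_dst;
}

/* Biflow key: order the two endpoints so that both directions of a
 * conversation map to the same key. */
flow_key biflow_key(flow_key k)
{
    if (k.ip_src > k.ip_dst ||
        (k.ip_src == k.ip_dst && k.port_src > k.port_dst)) {
        uint32_t ip = k.ip_src;   k.ip_src = k.ip_dst;     k.ip_dst = ip;
        uint16_t p  = k.port_src; k.port_src = k.port_dst; k.port_dst = p;
    }
    return k;
}

/* A session has expired when it has been idle longer than the timeout
 * (all values in seconds). */
int session_expired(double last_seen, double now, double timeout)
{
    assert(timeout > 0.0);
    return (now - last_seen) > timeout;
}
```

Ordering the endpoints discards the packet direction, so a real session structure would likely record separately which side sent the first packet.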

TIE’s Components: Feature Extractor

• Features
  • Portions of payload
  • Pkt/byte count
  • PS vector
  • IPT vector
  • …
• Features can be enabled/disabled at compile time

TIE’s Components: Classification Plugins (1/2)

• Each plugin implements a specific classification technique
• It operates on a session
• It returns a result that includes a confidence value
• “dummy” plugin source available

typedef struct {
    int (*disable)();
    int (*enable)();
    int (*load_signatures)(char *);
    int (*train)(char *);
    class_output *(*classify_session)(void *session);
    int (*dump_statistics)(FILE *);
    bool (*is_session_classifiable)(void *session);
    int (*session_sign)(session *, class_output *);
    char *name;
    u_int32_t *flags;
} classifier;
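To show how a plugin might fill in this interface, here is a sketch of a do-nothing plugin. It is not the "dummy" plugin shipped with TIE. The `session` and `class_output` types are stand-ins invented for the example (the slide does not define them), `u_int32_t` is written as `uint32_t`, the `void *` parameter is renamed `sess`, and the meaning of flag bit 0 and of `app_id` 0 are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed stand-ins: the slide does not define these two types. */
typedef struct { uint64_t pkts; uint16_t port_dst; } session;
typedef struct { int app_id; uint8_t confidence; } class_output;

/* Interface as on the slide, with u_int32_t written as uint32_t. */
typedef struct {
    int (*disable)(void);
    int (*enable)(void);
    int (*load_signatures)(char *);
    int (*train)(char *);
    class_output *(*classify_session)(void *sess);
    int (*dump_statistics)(FILE *);
    bool (*is_session_classifiable)(void *sess);
    int (*session_sign)(session *, class_output *);
    char *name;
    uint32_t *flags;
} classifier;

static char dummy_name[] = "dummy";
static uint32_t dummy_flags;        /* bit 0 = enabled (assumed meaning) */
static class_output dummy_result;   /* reused between calls; not thread safe */

static int dummy_enable(void)  { dummy_flags |= 1u;  return 0; }
static int dummy_disable(void) { dummy_flags &= ~1u; return 0; }
static int dummy_load_signatures(char *path) { (void)path; return 0; }
static int dummy_train(char *path)           { (void)path; return 0; }

/* The sketch needs at least one packet before it will classify. */
static bool dummy_is_classifiable(void *sess)
{
    return ((session *)sess)->pkts >= 1;
}

/* Always answers "unknown" (app_id 0 here) with zero confidence. */
static class_output *dummy_classify(void *sess)
{
    (void)sess;
    dummy_result.app_id = 0;
    dummy_result.confidence = 0;
    return &dummy_result;
}

static int dummy_dump_statistics(FILE *fp)
{
    return fprintf(fp, "dummy: nothing to report\n") < 0 ? -1 : 0;
}

static int dummy_session_sign(session *s, class_output *out)
{
    (void)s; (void)out;
    return 0;
}

/* Fill in the function table; a host program would call this at start-up. */
void dummy_plugin_init(classifier *c)
{
    assert(c != NULL);
    c->disable = dummy_disable;
    c->enable = dummy_enable;
    c->load_signatures = dummy_load_signatures;
    c->train = dummy_train;
    c->classify_session = dummy_classify;
    c->dump_statistics = dummy_dump_statistics;
    c->is_session_classifiable = dummy_is_classifiable;
    c->session_sign = dummy_session_sign;
    c->name = dummy_name;
    c->flags = &dummy_flags;
}
```

Because the host only sees the table of function pointers, a new technique can be added by writing one such file and registering its init function.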

TIE’s Components: Classification Plugins (2/2)

Name     Based on                          Status         Contributor
Port     L4 Ports                          Available      UNINA (signatures from CAIDA)
L7       Deep Payload Inspection           Available      UNINA (signatures/code from Linux L7-filter)
NBC      Lightweight Payload Inspection    Under test     UNINA
GMM-PS   Statistical Approach: PS          Under test     UNINA
HMM      Statistical Approach*: PS, IPT    Under test     UNINA
FPT      Statistical Approach**: PS, IPT   Under devel.   UNIBS
Joint    Machine Learning                  Under devel.   UNINA-CAIDA
???      ???                               ???            YOU ?

*A. Dainotti, W. de Donato, A. Pescapè, P. Salvo Rossi, "Classification of Network Traffic via Packet-Level Hidden Markov Models", IEEE GLOBECOM 2008

**M. Crotti, F. Gringoli, P. Pelosato, L. Salgarelli, "A Statistical Approach to IP-level classification of network traffic", IEEE ICC 2006

TIE’s Components: Decision Combiner

• The decision combiner determines the combination strategy
  • Which classifiers are invoked
  • When classifiers are invoked
  • How different results are combined to output a final decision and confidence value
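As one example of how different results could be combined, the sketch below implements a confidence-weighted vote: each classifier's confidence is added to the score of the class it chose, the class with the highest score wins, and the final confidence is that class's share of the total. This is a generic strategy picked for illustration, not a strategy the slides attribute to TIE; the `decision` type, the label space of four classes, and the convention that class 0 means "unknown" are assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define N_CLASSES 4   /* assumed small, fixed label space; 0 = unknown */

/* Output of one classifier: chosen class and confidence in [0, 1]. */
typedef struct { int app_id; double confidence; } decision;

/* Confidence-weighted vote over the n classifier outputs in d. */
decision combine_weighted_vote(const decision *d, size_t n)
{
    double score[N_CLASSES] = {0};
    double total = 0.0;
    decision out = {0, 0.0};

    for (size_t i = 0; i < n; i++) {
        assert(d[i].app_id >= 0 && d[i].app_id < N_CLASSES);
        if (d[i].app_id == 0)        /* "unknown" votes are ignored */
            continue;
        score[d[i].app_id] += d[i].confidence;
        total += d[i].confidence;
    }
    if (total <= 0.0)
        return out;                  /* nobody decided: stay unknown */

    /* Pick the class with the highest score; ties go to the lower id. */
    for (int c = 1; c < N_CLASSES; c++)
        if (score[c] > score[out.app_id])
            out.app_id = c;
    out.confidence = score[out.app_id] / total;
    return out;
}
```

Choosing which classifiers run, and when, would sit in front of a function like this, for instance by only invoking a costly payload-based classifier when cheaper ones disagree.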

TIE’s Components: Output

• There is a single output format; its semantics change depending on session type (flow, biflow) and working mode (offline, realtime, cyclic, …)
• Some Perl scripts help in processing the output (e.g. overall stats, confusion matrix, …)
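The kind of post-processing mentioned above can be illustrated with a confusion matrix and overall accuracy computed from per-session ground-truth and predicted labels. The sketch is in C for consistency with the plugin interface shown earlier and is not a port of the TIE Perl scripts; the function names and the three-class label space are assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define CM_CLASSES 3   /* assumed label space for the example */

/* cm[t][p] counts sessions whose ground-truth class is t and whose
 * predicted class is p. */
void confusion_matrix(const int *truth, const int *pred, size_t n,
                      unsigned cm[CM_CLASSES][CM_CLASSES])
{
    for (int t = 0; t < CM_CLASSES; t++)
        for (int p = 0; p < CM_CLASSES; p++)
            cm[t][p] = 0;
    for (size_t i = 0; i < n; i++) {
        assert(truth[i] >= 0 && truth[i] < CM_CLASSES);
        assert(pred[i]  >= 0 && pred[i]  < CM_CLASSES);
        cm[truth[i]][pred[i]]++;
    }
}

/* Overall accuracy: diagonal sum divided by the number of sessions. */
double overall_accuracy(unsigned cm[CM_CLASSES][CM_CLASSES])
{
    unsigned correct = 0, total = 0;
    for (int t = 0; t < CM_CLASSES; t++)
        for (int p = 0; p < CM_CLASSES; p++) {
            total += cm[t][p];
            if (t == p)
                correct += cm[t][p];
        }
    return total ? (double)correct / total : 0.0;
}
```

Off-diagonal cells show which classes get mistaken for which, something a single accuracy figure hides.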

Screenshots

[Five slides of screenshots; images not reproduced]


THANKS

For Cost-TMA activities write us @

[email protected]


[email protected]

Antonio Pescapè

http://www.grid.unina.it/Traffic