Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot...

13
Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot NANOG35, Oct 23-25, 2005

Transcript of Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot...

Page 1: Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot NANOG35, Oct 23-25, 2005.

Mining Anomalies in Network-Wide Flow Data

Anukool Lakhina with Mark Crovella and Christophe Diot

NANOG35, Oct 23-25, 2005

Page 2: Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot NANOG35, Oct 23-25, 2005.

2

My Talk in One Slide

• Goal: A general system to detect & classify traffic anomalies at carrier networks

• Network-wide flow data (eg, via NetFlow) exposes a wide range of anomalies– Both operational & malicious events

• I am here to seek your feedback

Page 3: Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot NANOG35, Oct 23-25, 2005.

3

Network-Wide Traffic Analysis

• Simultaneously analyze traffic flows across the network; e.g., using the traffic matrix

• Network-Wide data we use: Traffic matrix views for Abilene and Géant at 10 min bins

Page 4: Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot NANOG35, Oct 23-25, 2005.

4

LA

HSTNATLA

NYC

Power of Network-Wide Analysis

Distributed Attacks easier to detect at the ingress

IPLS

Peak rate: 300Mbps; Attack rate ~ 19Mbps/flow

Page 5: Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot NANOG35, Oct 23-25, 2005.

5

How do we extract anomalies and normal behavior from noisy, high-dimensional data in a systematic manner?

But, This is Difficult!

Page 6: Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot NANOG35, Oct 23-25, 2005.

6

The Subspace Method [LCD:SIGCOMM ‘04]

• An approach to separate normal & anomalous network-wide traffic

• Designate temporal patterns most common to all the OD flows as the normal patterns

• Remaining temporal patterns form the anomalous patterns

• Detect anomalies by statistical thresholds on anomalous patterns

Page 7: Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot NANOG35, Oct 23-25, 2005.

7

An example user anomaly

One Src-Dst Pair Dominates: 32% of B, 20% of P traffic

Cause: Bandwidth Measurement using iperf by SLAC

Page 8: Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot NANOG35, Oct 23-25, 2005.

8

An example operational anomaly

Multihomed customer CALREN reroutes around outage at LOSA

Page 9: Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot NANOG35, Oct 23-25, 2005.

9

Summary of Anomaly Types Found [LCD:IMC04]

Alpha

DOS

Scans

FlashEvents

Unknown

False Alarms

Traffic ShiftOutageWormPoint-Multipoint

Page 10: Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot NANOG35, Oct 23-25, 2005.

10

Automatically Classifying Anomalies [LCD:SIGCOMM05]

• Goal: Classify anomalies without restricting yourself to a predefined set of anomalies

• Approach: Leverage 4-tuple header fields:

SrcIP, SrcPort, DstIP, DstPort– In particular, measure dispersion in fields

• Then, apply off-the-shelf clustering methods

Page 11: Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot NANOG35, Oct 23-25, 2005.

11

Example of Anomaly Clusters

Summary: Correctly classified 292 of 296 injected anomalies

(DstIP

)

(SrcIP)

Legend

Code Red Scanning

Single source DOS attack

Multi source DOS attack

(SrcIP)

DispersedConcentrated

Dispersed

Page 12: Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot NANOG35, Oct 23-25, 2005.

12

Summary

• Network-Wide Detection: – Broad range of anomalies with low false alarms– In papers: Highly sensitive detection, even when

anomaly is 1% of background traffic

• Anomaly Classification:– Feature clusters automatically classify anomalies– In papers: clusters expose new anomalies

• Network-wide data and header analysis are promising for general anomaly diagnosis

Page 13: Mining Anomalies in Network-Wide Flow Data Anukool Lakhina with Mark Crovella and Christophe Diot NANOG35, Oct 23-25, 2005.

13

More information

• Ongoing Work: implementing algorithms in a prototype system

• For more information, see papers & slides at:

http://cs-people.bu.edu/anukool/pubs.html

• Your feedback much needed & appreciated!