Detection and Identification of Network Anomalies Using Sketch Subspaces

Detection and Identification of Network Anomalies Using Sketch Subspaces. Xin Li, Fang Bian, Mark Crovella, Christophe Diot, Ramesh Govindan, Gianluca Iannaccone, and Anukool Lakhina. ACM Internet Measurement Conference 2006. Speaker: Chang Huan Wu, 2009/5/1


Page 1: Detection and Identification of Network Anomalies Using Sketch Subspaces

Detection and Identification of Network Anomalies Using Sketch Subspaces

Xin Li, Fang Bian, Mark Crovella, Christophe Diot, Ramesh Govindan, Gianluca Iannaccone, and Anukool Lakhina

ACM Internet Measurement Conference 2006

Speaker: Chang Huan Wu

2009/5/1

Page 2: Detection and Identification of Network Anomalies Using Sketch Subspaces

Outline

Introduction

Previous Approach

Defeat

Evaluation

Conclusions

Page 3: Detection and Identification of Network Anomalies Using Sketch Subspaces

Introduction (1/3)

Unusual traffic patterns arise from network abuse as well as from legitimate activity

These traffic anomalies are often difficult to detect at a single link and require scrutiny of the entire network

Page 4: Detection and Identification of Network Anomalies Using Sketch Subspaces

Introduction (2/3)

Characterizing “normal” traffic using an IP-flow representation is intractable
– High dimension

Reduce dimension and identify anomalies

Page 5: Detection and Identification of Network Anomalies Using Sketch Subspaces

Introduction (3/3)

Previous work aggregates netflow into origin-destination (OD) flows

This work modifies that approach to increase the detection rate while reducing false alarms, and to identify the IP flows responsible for the anomaly

[Figure labels: Points of Presence (PoP), Link]

Page 6: Detection and Identification of Network Anomalies Using Sketch Subspaces

Previous Approach

Reference: Anukool Lakhina, Mark Crovella, Christophe Diot, "Mining Anomalies Using Traffic Feature Distributions," In ACM SIGCOMM 2005

Page 7: Detection and Identification of Network Anomalies Using Sketch Subspaces

Volume vs. Traffic Feature Distribution

Volume-based detection schemes have been successful in isolating large traffic changes
– But a large number of anomalies do NOT cause detectable disruptions in traffic volume

Using traffic feature distributions
– Augments volume-based anomaly detection
– Traffic distributions can reveal valuable information about the structure of anomalies

Page 8: Detection and Identification of Network Anomalies Using Sketch Subspaces

Port scan anomalies viewed in terms of traffic volume and in terms of entropy

[Figure annotations: "Port scan dwarfed in volume metrics…", "But stands out in feature entropy"]
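The contrast between volume and entropy can be illustrated with a small sketch. The port numbers and packet counts below are invented for illustration and are not taken from the talk; the sketch only assumes that "entropy" means the Shannon entropy of the empirical distribution of a feature within one time bin.

```python
import math
from collections import Counter

def sample_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical time bin of normal traffic: 1000 packets to three ports.
normal_ports = [80] * 600 + [443] * 300 + [53] * 100

# Same bin plus a hypothetical port scan: 50 packets, each to a distinct port.
scan_ports = normal_ports + list(range(10000, 10050))

volume_change = len(scan_ports) / len(normal_ports)  # relative packet volume
h_normal = sample_entropy(normal_ports)              # DstPort entropy, no scan
h_scan = sample_entropy(scan_ports)                  # DstPort entropy, with scan
```

In this toy bin the scan adds only 5% to the packet count, while the destination-port entropy rises noticeably because the scan spreads packets over many previously unseen ports.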

Page 9: Detection and Identification of Network Anomalies Using Sketch Subspaces

Traffic Feature Distributions

Anomalies can be detected and distinguished by inspecting traffic features:
– 4-tuple: SrcIP, SrcPort, DstIP, DstPort

Page 10: Detection and Identification of Network Anomalies Using Sketch Subspaces

Entropy-based scheme

In the volume-based scheme, the number of packets or bytes per time slot was the variable.

In the entropy-based scheme, the entropy of every traffic feature in every time slot is the variable.

This gives us a three-way data matrix H.
– H(t, p, k) denotes the entropy of traffic feature k of OD flow p at time t.

To apply the subspace method, we need to unfold it into a single-way representation.

Page 11: Detection and Identification of Network Anomalies Using Sketch Subspaces

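Subspace Decomposition

Normal subspace, $\mathcal{S}$: first k principal components

Anomalous subspace, $\tilde{\mathcal{S}}$: remaining principal components

Then, decompose traffic on all links by projecting onto $\mathcal{S}$ and $\tilde{\mathcal{S}}$ to obtain:

$$\mathbf{y} = \hat{\mathbf{y}} + \tilde{\mathbf{y}}$$

where $\mathbf{y}$ is the traffic vector at a particular point in time, $\hat{\mathbf{y}}$ is the normal traffic vector, and $\tilde{\mathbf{y}}$ is the residual traffic vector.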
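A minimal sketch of this decomposition, assuming the traffic measurements are arranged as a matrix with one row per time bin and one column per link, and that the principal components are obtained from an SVD of the column-centered data. The function name `subspace_decompose` and the synthetic data are illustrative assumptions, not part of the talk.

```python
import numpy as np

def subspace_decompose(Y, k):
    """Split each row of Y (time bins x links) into normal and residual parts.

    The normal subspace is spanned by the first k principal components of
    the column-centered data; the anomalous subspace by the remaining ones.
    """
    Yc = Y - Y.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    P = Vt[:k].T             # basis of the normal subspace
    Y_hat = Yc @ P @ P.T     # projection onto the normal subspace
    Y_tilde = Yc - Y_hat     # projection onto the anomalous subspace
    return Y_hat, Y_tilde

rng = np.random.default_rng(0)
# Hypothetical data: 200 time bins, 6 links driven by 2 shared trends plus noise.
trends = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
Y = trends @ mixing + 0.01 * rng.normal(size=(200, 6))

Y_hat, Y_tilde = subspace_decompose(Y, k=2)
```

By construction the two parts of every row sum to the centered traffic vector and are orthogonal to each other; on this synthetic data almost all of the energy lands in the normal part because only two trends drive the links.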

Page 12: Detection and Identification of Network Anomalies Using Sketch Subspaces

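Geometric illustration

[Figure axes: Traffic on link 1, Traffic on link 2; plotted vector: y]

In general, anomalous traffic results in a large value of $\|\tilde{\mathbf{y}}\|$, the norm of the residual traffic vector. Use $\|\tilde{\mathbf{y}}\|^2$ to identify whether it is anomalous.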
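The detection step can be sketched as follows: compute the squared norm of the residual for every time bin and flag the bins where it is unusually large. The threshold here is a simple empirical quantile, which is an assumption of this sketch and not necessarily the rule used in the original work; the helper name `spe` and the injected spike are likewise hypothetical.

```python
import numpy as np

def spe(Y, k):
    """Squared residual norm of each row after removing the top-k PCA subspace."""
    Yc = Y - Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    P = Vt[:k].T
    residual = Yc - Yc @ P @ P.T
    return (residual ** 2).sum(axis=1)

rng = np.random.default_rng(1)
# Hypothetical traffic: 300 time bins on 5 links following one shared trend.
trend = rng.normal(size=(300, 1))
Y = trend @ np.ones((1, 5)) + 0.05 * rng.normal(size=(300, 5))
Y[120, 3] += 2.0                       # inject a spike on one link at bin 120

scores = spe(Y, k=1)
threshold = np.quantile(scores, 0.99)  # assumed empirical threshold
flagged = np.flatnonzero(scores > threshold)
```

The injected spike does not follow the shared trend, so most of it falls into the residual and bin 120 receives by far the largest score.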

Page 13: Detection and Identification of Network Anomalies Using Sketch Subspaces

Multiway Subspace Method (multi-way to single-way)

Decompose into a single-way matrix

Now apply the usual subspace decomposition (PCA)
– Every row of the matrix will be decomposed into a normal part and a residual part

[Figure: the three-way matrix, with axes # timebins and # od-pairs and one layer each for H(SrcIP), H(SrcPort), H(DstIP), H(DstPort), unfolded into a single-way matrix]
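The unfolding can be sketched in a few lines, assuming the three-way matrix is stored as an array indexed by (time bin, OD pair, feature) and that unfolding means placing the per-feature matrices side by side. The sizes and the random placeholder values are hypothetical.

```python
import numpy as np

features = ["SrcIP", "SrcPort", "DstIP", "DstPort"]
t, p, k = 6, 3, len(features)   # hypothetical sizes: time bins, OD pairs, features

rng = np.random.default_rng(2)
H = rng.random((t, p, k))       # placeholder entropies H(t, p, k)

# Place the t x p matrix of each feature side by side: result is t x (k * p).
H_unfolded = np.concatenate([H[:, :, j] for j in range(k)], axis=1)

# Equivalent single call: move the feature axis ahead of the OD-pair axis.
H_unfolded_alt = H.transpose(0, 2, 1).reshape(t, k * p)
```

Each row of the unfolded matrix then holds the entropies of all features over all OD pairs for one time bin, which is the shape the ordinary subspace method expects.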

Page 14: Detection and Identification of Network Anomalies Using Sketch Subspaces

Defeat (1/2)

Use random aggregations of IP flows (sketches)

Put an IP flow into different hash functions (h1, h2…)

[Figure: hash functions h1 to h5, each mapping into s buckets, shown for "R1, SrcIP" and "R2, SrcIP"; the h1 buckets of R1 and R2 yield an "Entropy of h1" row for each time bin t1, t2, …, tn]
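A rough sketch of the sketching step, under several assumptions that are not stated on the slides: the hash key is the flow's full 4-tuple, a salted BLAKE2 digest stands in for each hash function, and every flow record counts once. The helper names (`bucket_of`, `sketch_entropies`) and the generated flows are hypothetical.

```python
import hashlib
import math
from collections import Counter, defaultdict

SRC_IP = 0  # index of the SrcIP field in a (SrcIP, SrcPort, DstIP, DstPort) tuple

def bucket_of(key, seed, s):
    """Map a flow key to one of s buckets; a salted digest stands in for h_seed."""
    digest = hashlib.blake2b(repr(key).encode(), salt=seed.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big") % s

def entropy(counter):
    total = sum(counter.values())
    return -sum((c / total) * math.log2(c / total) for c in counter.values())

def sketch_entropies(flows, m, s, feature):
    """For each of m hash functions, return the entropy of `feature` in each of s buckets."""
    result = []
    for seed in range(m):
        buckets = defaultdict(Counter)
        for flow in flows:
            buckets[bucket_of(flow, seed, s)][flow[feature]] += 1
        result.append([entropy(buckets[b]) if b in buckets else 0.0 for b in range(s)])
    return result

# Hypothetical flows observed in one time bin.
flows = [(f"10.0.0.{i % 20}", 1024 + i, f"10.0.1.{i % 5}", 80) for i in range(200)]
E = sketch_entropies(flows, m=5, s=4, feature=SRC_IP)
```

Here the s buckets of one hash function play the role that OD flows played in the previous approach: repeating the computation in every time bin gives one entropy time series per bucket and per hash function.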

Page 15: Detection and Identification of Network Anomalies Using Sketch Subspaces

Defeat (2/2)

Apply multiway subspace method to each hash function

Across all m hash functions, see how many identify an anomaly
– Voting approach

[Figure: for h1, the "Entropy of h1" rows for time bins t1, t2, …, tn are laid side by side for SrcIP, SrcPort, DstIP, and DstPort]
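The voting step can be sketched as below, assuming each hash function's detector has already produced one boolean alarm per time bin. The alarm table is invented for illustration, and the choice of m − 2 votes follows the figure quoted on the Evaluation slide.

```python
def vote(alarms, min_votes):
    """alarms[i][t] is True if the detector for hash function i flags time bin t."""
    m = len(alarms)
    n_bins = len(alarms[0])
    votes = [sum(alarms[i][t] for i in range(m)) for t in range(n_bins)]
    return [t for t, v in enumerate(votes) if v >= min_votes]

# Hypothetical outcomes for m = 5 hash functions over 6 time bins.
alarms = [
    [False, False, True,  False, False, False],
    [False, True,  True,  False, False, False],
    [False, False, True,  False, False, True],
    [False, False, False, False, False, False],
    [False, False, True,  False, False, False],
]
m = len(alarms)
anomalous_bins = vote(alarms, min_votes=m - 2)
```

Requiring several hash functions to agree suppresses alarms raised by a single unlucky aggregation, while a real anomaly tends to show up under most of the random aggregations.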

Page 16: Detection and Identification of Network Anomalies Using Sketch Subspaces

Identify Anomalies

Find the element in each hash function that is identified as anomalous

The intersection of the key sets over all hash functions that raised an alarm identifies the keys of the IP flows that caused the anomaly (with high likelihood)

[Figure: at time t1, "Entropy of h1", "Entropy of h2", "Entropy of h3", and "Entropy of h4", each over s buckets]
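The identification step can be sketched as a set intersection. As assumptions of this sketch, the flow keys seen in the anomalous time bin are available, a salted BLAKE2 digest stands in for each hash function, and each alarming hash function reports which of its buckets were flagged. The names `bucket_of` and `identify` and the generated keys are hypothetical.

```python
import hashlib

def bucket_of(key, seed, s):
    """Map a flow key to one of s buckets; a salted digest stands in for h_seed."""
    digest = hashlib.blake2b(repr(key).encode(), salt=seed.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big") % s

def identify(keys, anomalous_buckets, s):
    """Intersect, over alarming hash functions, the keys that fall in a flagged bucket.

    anomalous_buckets maps a hash-function seed to the set of bucket indices
    its detector flagged; hash functions that raised no alarm are absent.
    """
    candidates = set(keys)
    for seed, flagged in anomalous_buckets.items():
        candidates &= {key for key in keys if bucket_of(key, seed, s) in flagged}
    return candidates

# Hypothetical flow keys in the anomalous time bin; keys[37] is the culprit.
keys = [(f"10.0.{i // 250}.{i % 250}", 1024 + i, "10.1.0.1", 80) for i in range(500)]
culprit = keys[37]
s = 64
# Suppose four hash functions each flagged the bucket that holds the culprit.
anomalous_buckets = {seed: {bucket_of(culprit, seed, s)} for seed in range(4)}

suspects = identify(keys, anomalous_buckets, s)
```

An innocent key survives the intersection only if it collides with the culprit under every alarming hash function, which is why the result holds "with high likelihood" and narrows as more hash functions raise the alarm.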

Page 17: Detection and Identification of Network Anomalies Using Sketch Subspaces


Evaluation (1/2)

Page 18: Detection and Identification of Network Anomalies Using Sketch Subspaces

Evaluation (2/2)

5 or 6 hash functions are enough

If m is the number of hash functions, m−2 or more votes may be enough

Page 19: Detection and Identification of Network Anomalies Using Sketch Subspaces

Conclusion

Uses multiple random traffic projections to robustly detect anomalies

Higher detection rate and fewer false alarms

Able to automatically infer the IP flows responsible for an anomaly

Page 20: Detection and Identification of Network Anomalies Using Sketch Subspaces

Comments

Can only handle offline data

Can other fields in the packet header be used for anomaly detection?