Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.


Page 1: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Frequency Analysis of Protocols

Dr. Craig Partridge

BBN Technologies

Page 2: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

An Emerging Field

Using techniques from signal processing to better understand networks and protocols

A quick tour of the work done to date

Along with some highly speculative thoughts about what might come next

Page 3: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

An Overview of the Basic Concepts

Please note, I’m a systems person, not a mathematician. This talk is structured for an intuitive understanding…

… although I’ll try to be rigorous where necessary

Page 4: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Step 1: Capture Packet Traces

Place taps or measuring devices in various spots in the network

For each transmission seen, capture:

Time

Direction

Duration

Other stuff as desired

[Figure: a network with four taps attached]

Page 5: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Step 2: Trace to Signal

Trace is a discrete time series (time + data in non-uniform time increments)

Signal processing wants a time/amplitude series (often a uniform series)
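A minimal sketch of the trace-to-signal step, assuming the trace is just a list of arrival timestamps in seconds. The function name `trace_to_signal` and the `bin_width` parameter are illustrative, not from the talk; other representations (such as a 0/1 series) are equally valid.

```python
from typing import List, Sequence


def trace_to_signal(timestamps: Sequence[float], bin_width: float) -> List[int]:
    """Count packet arrivals in consecutive bins of bin_width seconds."""
    if not timestamps:
        return []
    start = min(timestamps)
    n_bins = int((max(timestamps) - start) / bin_width) + 1
    signal = [0] * n_bins
    for t in timestamps:
        # Each irregular arrival time falls into exactly one uniform bin.
        signal[int((t - start) / bin_width)] += 1
    return signal


# Irregular arrival times become a uniform packets-per-bin series.
arrivals = [0.0, 0.25, 0.5, 1.5, 3.0]
signal = trace_to_signal(arrivals, bin_width=1.0)
```

The bin width sets the highest frequency the later analysis can see, so in practice it would be chosen to suit the algorithm being run.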

Page 6: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Step 3: Run Feature Detection Algorithms over Signal

The meat of the task…

Indeed, the signal representation you choose is largely dictated by the algorithm you wish to run

Various algorithms extract various types of information

Rest of the talk is a survey of what has been done

Page 7: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

USC DDoS Attack

How many sources are attacking you?

Capture attack packets

Convert to a uniform series: x(t) = # of attack packets received in millisecond t

Condition the signal

Subtract the mean of x(t)

Removes the dominant (zero-frequency) component
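The conditioning step can be sketched as follows, assuming arrival times are given as integer microseconds from the start of the capture. The helper names `attack_signal` and `remove_mean` are hypothetical.

```python
from typing import List, Sequence


def attack_signal(arrival_us: Sequence[int], duration_ms: int) -> List[int]:
    """x(t) = number of attack packets received in millisecond t."""
    x = [0] * duration_ms
    for t in arrival_us:
        ms = t // 1000  # integer microseconds -> millisecond index
        if 0 <= ms < duration_ms:
            x[ms] += 1
    return x


def remove_mean(x: Sequence[float]) -> List[float]:
    """Subtract the mean so the zero-frequency term no longer dominates."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]


x = attack_signal([100, 900, 1500, 3999], duration_ms=4)
conditioned = remove_mean(x)
```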

Page 8: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

DDoS Continued

Now do auto-correlation and compute spectral density

Basically looking for frequency variations in the attack stream over time

A uniform source would show a single stable set of frequencies

Spectral density: a spectrum where you show the power at each frequency
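A rough sketch of auto-correlation followed by spectral density, under the assumption that the signal is treated as circular (wrapping at the ends) and that a plain O(n^2) DFT is acceptable. The actual USC analysis may differ in windowing and estimator; the function names are illustrative.

```python
import cmath
import math
from typing import List, Sequence


def autocorrelation(x: Sequence[float]) -> List[float]:
    """Circular autocorrelation of a mean-removed signal, one value per lag."""
    n = len(x)
    return [sum(x[t] * x[(t + lag) % n] for t in range(n)) / n for lag in range(n)]


def spectral_density(x: Sequence[float]) -> List[float]:
    """Power at each frequency bin: the DFT of the autocorrelation."""
    r = autocorrelation(x)
    n = len(r)
    return [
        abs(sum(r[lag] * cmath.exp(-2j * cmath.pi * k * lag / n) for lag in range(n)))
        for k in range(n // 2 + 1)
    ]


# A source with a steady rhythm: 3 cycles across 32 samples.
n = 32
x = [math.cos(2 * math.pi * 3 * t / n) for t in range(n)]
psd = spectral_density(x)
peak_bin = max(range(len(psd)), key=psd.__getitem__)
# With 1 ms samples, bin k corresponds to k / (n * 0.001) Hz.
```

A single steady source concentrates its power in one bin, as here; several sources with different rhythms would spread power over several peaks.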

Page 9: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Wavelet-based Approach

Huang, Feldmann, Willinger

Finding time structures in traces

Capture packet traces at some point

Divide into conversations/flows

Use source/destination/prefix info to do division

Divide according to what class of traffic you wish to analyze

Convert traces to uniform signal of 0/1

Page 10: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

More Wavelet

Compute an energy function

Compute discrete Haar wavelet transform

Energy function measures wavelet coefficients

Low coefficients reveal regular or periodic structure in time series

Use energy graphs to reveal periodic structure
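A minimal sketch of a per-scale energy function, assuming "energy" means the average squared Haar detail coefficient at each scale and that the signal length is a power of two. The function name `haar_energy` is illustrative and the published method may normalize differently.

```python
import math
from typing import List, Sequence


def haar_energy(signal: Sequence[float]) -> List[float]:
    """Average squared Haar detail coefficient per scale, finest scale first."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = [float(v) for v in signal]
    energies = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Detail coefficients capture change between neighbors at this scale.
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        # Approximation coefficients feed the next, coarser scale.
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail) / len(detail))
    return energies


# A 0/1 signal with period 2: all variation lives at the finest scale.
energies = haar_energy([1, 0, 1, 0, 1, 0, 1, 0])
```

Plotting the energies against scale gives the kind of energy graph the slide refers to; here the coarser scales drop to zero because the signal is perfectly periodic.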

Page 11: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Lomb Periodogram

Cousins, Krishnan, Partridge

Similar to wavelet approach

Lomb periodogram: designed for non-uniform signal traces [ideal for packets]

Computes spectral power at each frequency
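A sketch of the standard Lomb periodogram formula evaluated at one frequency, assuming each sample is a (time, value) pair with non-constant values. What the value represents for a packet trace (size, count, and so on) is left open here, and `lomb_power` is a hypothetical name.

```python
import math
import random
from typing import Sequence


def lomb_power(times: Sequence[float], values: Sequence[float], freq_hz: float) -> float:
    """Normalized Lomb spectral power at freq_hz for unevenly spaced samples."""
    n = len(times)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    w = 2 * math.pi * freq_hz
    # The offset tau makes the result independent of where time zero is placed.
    tau = math.atan2(
        sum(math.sin(2 * w * t) for t in times),
        sum(math.cos(2 * w * t) for t in times),
    ) / (2 * w)
    c = [math.cos(w * (t - tau)) for t in times]
    s = [math.sin(w * (t - tau)) for t in times]
    yc = sum((v - mean) * ci for v, ci in zip(values, c))
    ys = sum((v - mean) * si for v, si in zip(values, s))
    return (yc * yc / sum(ci * ci for ci in c) + ys * ys / sum(si * si for si in s)) / (2 * var)


# Synthetic check: a 2 Hz rhythm sampled at 200 irregular instants over 10 s.
rng = random.Random(0)
times = sorted(rng.uniform(0.0, 10.0) for _ in range(200))
values = [math.sin(2 * math.pi * 2.0 * t) for t in times]
freqs = [i / 10 for i in range(5, 51)]  # 0.5 Hz to 5.0 Hz
peak_freq = max(freqs, key=lambda f: lomb_power(times, values, f))
```

No resampling onto a uniform grid is needed, which is what makes the method attractive for packet timestamps.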

Page 12: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Lomb Example

Page 13: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Example Results

• Identified CBR Send Rates

• Identified FTP Round Trip Times

• Characteristics from all three flows observed

Page 14: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Node ID    Application Frequencies (Hz)
XX0        –
X00        1.0
X01        4.88
X02        1.0, 4.51
X03        1.0
X04        1.0
X05        4.88
X06        1.0
X07        1.0
X08        4.88
X09        1.0
X10        4.88
X11        1.0, 4.51
X12        1.0
Xp0        1.0, 24.41
Xp1        1.0, 24.41
Xp2        1.0, 24.41
Xp3        1.0, 24.41
Xp4        1.0, 14.64
Xp5        1.0, 14.64

Green: Correct Detection

Red: Missed Detection

Data: 18 nodes, tcpdump

Results:

• Detected 6 out of 6 application frequencies emitted

• Detected 15 out of 27 traffic generators

• Missed most generators emitting at 1 Hz

Spectral Techniques easily show periodic application traffic on the network

[Figure: Lomb Analysis of 802.11b Data; label: 24.41 Hz]

Page 15: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

A Pause to Comment

All three approaches mentioned so far have the characteristic that

We can detect timing structure from our data

If we have ground-truth, we can show how the timing structure we find relates to the timing structures in the network

But, without ground-truth, we can’t say for sure what the structure means

Page 16: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Topology Discovery

Techniques where we can show a valuable set of results, without ground-truth to interpret

Discover links in a network (wireless)

Coherence

Causality

Given complete map, which links are used?

Route discovery

Page 17: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Coherence

Take samples of the time series at different points in the network

Compare them, offset in time

Look for statistically significant relationships between their spectral peaks

Page 18: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

A Sketch of the Coherence Math

Compute the Discrete Fourier Transform

This gives you a series of equally spaced points in a spectrum

The Cross Spectral Density is an averaged product (for each of the points in the spectrum) of the DFT of one series with the complex conjugate of the DFT from another series

Normalize the CSD to 0…1 to get coherence
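The coherence recipe above might be sketched like this, assuming the averaging is done over non-overlapping segments with no windowing, and that normalization divides the squared magnitude of the CSD by the two auto-spectra. The names `dft` and `coherence` and the `seg_len` parameter are illustrative.

```python
import cmath
import random
from typing import List, Sequence


def dft(x: Sequence[float]) -> List[complex]:
    """Plain O(n^2) Discrete Fourier Transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) for k in range(n)]


def coherence(x: Sequence[float], y: Sequence[float], seg_len: int) -> List[float]:
    """Coherence in 0..1 per frequency bin, averaged over segments."""
    n_seg = min(len(x), len(y)) // seg_len
    pxx = [0.0] * seg_len
    pyy = [0.0] * seg_len
    pxy = [0j] * seg_len
    for s in range(n_seg):
        fx = dft(x[s * seg_len:(s + 1) * seg_len])
        fy = dft(y[s * seg_len:(s + 1) * seg_len])
        for k in range(seg_len):
            pxx[k] += abs(fx[k]) ** 2
            pyy[k] += abs(fy[k]) ** 2
            pxy[k] += fx[k] * fy[k].conjugate()  # cross spectral density term
    return [
        abs(pxy[k]) ** 2 / (pxx[k] * pyy[k]) if pxx[k] * pyy[k] > 0 else 0.0
        for k in range(seg_len // 2 + 1)
    ]


rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(64)]
same = coherence(x, [2.0 * v for v in x], seg_len=16)   # linearly related series
noise = [rng.gauss(0.0, 1.0) for _ in range(64)]
unrelated = coherence(x, noise, seg_len=16)             # independent series
```

Averaging over several segments matters: with a single segment the ratio is 1 at every frequency regardless of the data.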

Page 19: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Coherence Plots

[Figure: six coherence plots, one per node pair: Coherence (n0, n1), Coherence (n0, n2), Coherence (n1, n2), Coherence (n0, n3), Coherence (n1, n3), Coherence (n2, n3). Horizontal axis ticks 0, 500, 1000; vertical axis ticks 0, 0.5, 1.]

Page 20: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Coherence Comments

Coherence works

Nicely tracks moving nodes

But coherence gets confused

For instance, confusion over applications with similar periodicity

Sometimes skips hop in path

Page 21: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Causality

Instead of related spectra, try relating individual transmissions to transmissions that came before

Define a weight function W that estimates the likelihood that event k came from a prior transmission by node i

Then the probability that an event at node i caused k is:

$$p_i^k = \frac{W_i^k}{\sum_{j=1}^{n} W_j^k}$$
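A small sketch of the probability computation, assuming an exponential weight on the time since each node's most recent transmission. The function name `cause_probabilities` and the `rate` parameter are hypothetical; any other weight function W could be substituted.

```python
import math
from typing import Dict


def cause_probabilities(event_time: float, last_tx: Dict[str, float], rate: float = 1.0) -> Dict[str, float]:
    """Normalize per-node weights into the probability that each node caused the event."""
    # Assumed weight: exponential decay in the age of the node's last transmission.
    weights = {
        node: math.exp(-rate * (event_time - t))
        for node, t in last_tx.items()
        if t < event_time
    }
    total = sum(weights.values())
    if total == 0:
        return {}
    return {node: w / total for node, w in weights.items()}


# Node "a" transmitted 0.1 s before the event, node "b" 1.0 s before it.
p = cause_probabilities(1.0, {"a": 0.9, "b": 0.0})
```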

Page 22: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Topology Discovery

Now create a conversation matrix

Consider C_i, the set of all events at a particular node i. The probability that node j is sending to node i is:

$$p_{ij} = \frac{1}{|C_i|} \sum_{e \in C_i} \frac{W_j^e}{\sum_{l=1}^{n} W_l^e}$$

These values define a matrix

Row x is probabilities that x is sending to each of the nodes

Column x is probability that x is receiving from each of the nodes

N.B.: Probability can be computed incrementally over C
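One way the incremental computation could look, assuming events arrive as (time, node) pairs, an exponential weight on each other node's most recent event, and that events with no earlier candidate source are skipped. These are all assumptions for illustration, as is the name `conversation_matrix`.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def conversation_matrix(events: Iterable[Tuple[float, str]], rate: float = 1.0) -> Dict[str, Dict[str, float]]:
    """matrix[sender][receiver] = averaged probability that sender is sending to receiver."""
    last_seen: Dict[str, float] = {}
    totals = defaultdict(lambda: defaultdict(float))  # running sums, updated per event
    counts = defaultdict(int)                         # number of scored events per receiver
    for when, node in sorted(events):
        weights = {j: math.exp(-rate * (when - t)) for j, t in last_seen.items() if j != node}
        norm = sum(weights.values())
        if norm > 0:
            counts[node] += 1
            for j, w in weights.items():
                totals[j][node] += w / norm
        last_seen[node] = when
    return {j: {i: s / counts[i] for i, s in row.items()} for j, row in totals.items()}


events = [(0.0, "a"), (0.1, "b"), (0.5, "c"), (1.0, "a"), (1.1, "b")]
matrix = conversation_matrix(events)
```

Because only running sums and counts are kept, each new event costs one pass over the known nodes, which fits the note that the probability can be computed incrementally.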

Page 23: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Comments on Causality

Core idea: Over the course of a number of events, the probability function will give enough extra weight to correct sources to yield a good conversation matrix

Current W is pretty simple

Exponential (Poisson) focused on most recent event

Self similarity not a problem until we look fairly deep back in time

May need a more expensive weight function

Very fast… (real time analysis)

Page 24: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Egress Nodes

Extend the causality equation

For each event, compute 1 minus maximum weight: the egress weight

I.e. figure the weighting algorithm correctly identified the source of the event, if present. If no source, this inverse will be large

Define a new column of the conversation matrix that contains the normalized average of the egress weight. Large values flag egress nodes
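A sketch of the egress weight, assuming the same exponential weight as before (so raw weights already lie between 0 and 1) and taking the plain per-node mean as the "normalized average". The name `egress_scores` is hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def egress_scores(events: Iterable[Tuple[float, str]], rate: float = 1.0) -> Dict[str, float]:
    """Per-node average of (1 - maximum weight) over that node's events."""
    last_seen: Dict[str, float] = {}
    sums = defaultdict(float)
    counts = defaultdict(int)
    for when, node in sorted(events):
        # Strongest candidate source among the other nodes; 0 if there is none.
        best = max(
            (math.exp(-rate * (when - t)) for j, t in last_seen.items() if j != node),
            default=0.0,
        )
        sums[node] += 1.0 - best
        counts[node] += 1
        last_seen[node] = when
    return {node: sums[node] / counts[node] for node in counts}


# "g" transmits with no recent observed source; "a" always follows "g" closely.
events = [(0.0, "g"), (0.01, "a"), (10.0, "g"), (10.01, "a"), (20.0, "g"), (20.01, "a")]
scores = egress_scores(events)
```

In this toy trace "g" scores close to 1 and "a" close to 0, so "g" would be flagged as an egress node.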

Page 25: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Egress Example


Page 26: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Stitching

Once egress nodes identified, it is possible to connect graphs efficiently

Each probe shares with its neighboring probes the traffic traces from its egress nodes

Traces are combined to create a single trace between each set of pairs

Rerun the topology algorithm with the additional trace and see if a link appears

Page 27: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Stitching Example


Page 28: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Thoughts on Egress and Stitching

Extensions to causality analysis

Egress is highly dependent on the weighting function

Page 29: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

End-to-End Route Discovery

Discover end-to-end paths between communicating hosts (src and dst)

Route: A path or sequence of links (src to dst)

There may be multiple paths – need the path actually taken by data from src to dst

Require identification of active links

Can do receiver identification from conversation matrix

Choose shortest paths

Break ties using “aggregate path coherence”

Coherence between steps in each path

End result: Layer 3 (network) connectivity – Routing Tables
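A sketch of the route selection step, assuming active links are given as a directed adjacency map and that "aggregate path coherence" is the sum of per-link coherence values along the path. Both the aggregation rule and the name `best_route` are assumptions.

```python
from collections import deque
from typing import Dict, Iterator, List, Optional, Tuple


def best_route(
    links: Dict[str, List[str]],
    coherence: Dict[Tuple[str, str], float],
    src: str,
    dst: str,
) -> Optional[List[str]]:
    """Shortest path from src to dst; ties broken by highest summed link coherence."""
    # Breadth-first search gives the hop count from src to every reachable node.
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in links.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    if dst not in dist:
        return None

    def shortest_paths(u: str) -> Iterator[List[str]]:
        if u == dst:
            yield [u]
            return
        if dist[u] >= dist[dst]:
            return  # cannot reach dst on a shortest path from here
        for v in links.get(u, ()):
            if dist.get(v) == dist[u] + 1:
                for rest in shortest_paths(v):
                    yield [u] + rest

    def aggregate(path: List[str]) -> float:
        return sum(coherence.get(hop, 0.0) for hop in zip(path, path[1:]))

    return max(shortest_paths(src), key=aggregate)


links = {"s": ["a", "b"], "a": ["d"], "b": ["d"]}
coh = {("s", "a"): 0.9, ("a", "d"): 0.8, ("s", "b"): 0.4, ("b", "d"): 0.95}
route = best_route(links, coh, "s", "d")
```

Both candidate routes here are two hops long, so the one through "a" wins on aggregate coherence (1.7 against 1.35).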

Page 30: Frequency Analysis of Protocols Dr. Craig Partridge BBN Technologies.

Some Thoughts

Progress is likely to be rapid

Better techniques

Match and latch

Max-plus

Timing structure is remarkably robust

E.g. Lomb showed frequency of traffic that wasn’t visible