A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos...

23
A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002

Transcript of A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos...

Page 1: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

A Signal Analysis of Network Traffic Anomalies

Paul Barfordwith Jeffery Kline, David Plonka, Amos Ron

University of Wisconsin – Madison

Summer, 2002

Page 2: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

2

Motivation

• Traffic anomalies are a fact of life in computer networks– Outages, attacks, etc…

• Anomaly detection and identification is challenging– Operators typically monitor by eye using SNMP or IP flows

• Obviously, this does not scale!– Simple thresholding is ineffective– Some anomalies are obvious, other are not

• Characteristics of anomalous behavior in IP traffic are not well understood– Do same types of anomalies have same characteristics?– Can characteristics be effectively used in detection systems?

Page 3: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

3

Introduction

• Objective: Improve our understanding network traffic anomalies

• Approach: Wavelet analysis of data set that includes IP flow data, SNMP data and a catalog of observed anomalies

• Method: Integrated Measurement Analysis Platform for Internet Traffic (IMAPIT)

• Results: We demonstrate how anomalies can be exposed using wavelets and develop new method for exposing short-lived events

Page 4: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

4

Related Work

• Network traffic characterization– Eg. Caceres89, Leland93, Paxson97, Zhang01

• Focus on typical behavior

– Abry98 use wavelets to analyze LRD traffic

• Fault and anomaly detection techniques– Eg. Feather93, Brutlag00

• Focus on thresholds and time series models

– Eg. Paxson99• Rule based tool for intrusion detection

– Eg. Moore01• Backscatter technique can be used to identify DoS attacks

– Eg. Huang01• Wavelet-based approach to detecting network performance problems

Page 5: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

5

Simple Network Management Protocol

• SNMP is the standard protocol for monitoring/managing networked systems

• SNMP defines a set of MIB (management information base) data exported from routers– RFC2863

• We sample High Capacity Interface using MRTG (Multi-Router Traffic Grapher) at 5 minute intervals– Archive byte and packet traffic in each direction– 64-bit counters on each of 15 WAN links

• SNMP count precision is yet to be determined…

Page 6: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

6

IP Flows• An IP Flow is defined as a unidirectional series of

packets between source/dest IP/port pair over a period of time

– Exported by Lightweight Flow Accounting Protocol (LFAP) enabled routers (Cisco’s NetFlow, Juniper cflowd flow export)

• We use FlowScan [Plonka00] to collect and post-process IP flow data collected at 5 minute intervals– Combines flow collection engine, database, visulaization tool

– Provides a near real-time visualization of network traffic

– Breaks down traffic into well known service or application

{SRC_IP/Port,DST_IP/Port,Pkts,Bytes,Start/End Time,TCP Flags,IP Prot …}

Page 7: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

7

Page 8: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

8

Our Approach to Data Gathering• Consider anomalies in IP flow and SNMP data

– Collected at UW border router (Juniper M10)– Archive of ~6 months worth of data (packets, bytes, flows)– Includes catalog of anomalies (after-the-fact analysis)

• Group observed anomalies into four categories– Network anomalies (41)

• Steep drop offs in service followed by quick return to normal behavior– Flash crowd anomalies (4)

• Steep increase in service followed by slow return to normal behavior– Attack anomalies (46)

• Steep increase in flows in one direction followed by quick return to normal behavior– Measurement anomalies (18)

• Short-lived anomalies which are not network anomalies or attacks

Page 9: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

9

Our Approach to Analysis• Wavelets provide a means for describing time series data

that considers both frequency and time– Particularly useful for characterizing data with sharp spikes

and discontinuities• More robust than Fourier analysis which only shows what frequencies

exist in a signal

– Tricky to determine which wavelets provide best resolution of signals in data

• We use tools developed at UW which together make up IMAPIT– FlowScan software– The IDR Framenet software

Page 10: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

10

Our Wavelet System

• After evaluating different candidates we selected a wavelet system called Pseudo Splines(4,1) Type 2.– A framelet system developed by Daubechies et al. ‘00– Very good frequency localization properties

• Three output signals are extracted from input– Low Frequency (L): synthesis of all wavelet coefficients

from level 9 and up– Mid Frequency (M): synthesis of wavelet coefficients 6, 7, 8– High Frequency (H): synthesis of wavelet coefficients 1 to 5

• Thresholding (set to zero all coefficients whose absolute value is below a threshold) is used on these coefficients

Page 11: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

11

Ambient IP Flow Traffic

Page 12: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

12

Ambient SNMP Traffic

Page 13: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

13

Byte Traffic for Flash Crowd

Page 14: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

14

Average Packet Size for Flash Crowd

Page 15: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

15

Flow Traffic During DoS Attacks

Page 16: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

16

Byte Traffic During Measurement Anomalies

Page 17: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

17

Anomaly Detection via Deviation Score

• We develop an automated means for identifying short-lived anomalies based on variability in H and M signals

1. Compute local variability (using specified window) of H and M parts of signal

2. Combine local variability of H and M signals (using a weighted sum) and normalize by total variability to get deviation score V

3. Apply threshold to V then measure peaks• Our analysis shows that V peaks over 2.0 indicate

short-lived anomalies with high confidence– We threshold at V = 1.25 and set window size to ~3 hours

Page 18: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

18

Deviation Score for Three Anomalies

Page 19: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

19

Deviation Score for Network Outage

Page 20: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

20

Anomalies in Aggregate Signals

Page 21: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

21

Hidden Anomalies in Low Frequency

Page 22: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

22

Deviation Score Evaluation• How effective is deviation score at detecting anomalies?

– Compare versus set of 39 anomalies• Set is unlikely to be complete so we don’t treat false-positives

– Compare versus Holt-Winters Forecasting• Sophisticated time series technique• Requires some configuration

• Holt-Winters reported many more positives and sometimes oscillated between values

Total Candidate Anomalies

Candidates detected by Deviation

Score

Candidates detected by

Holt-Winters

39 38 37

Page 23: A Signal Analysis of Network Traffic Anomalies Paul Barford with Jeffery Kline, David Plonka, Amos Ron University of Wisconsin – Madison Summer, 2002.

23

Conclusion and Next Steps• We present an evaluation of signal characteristics of network traffic

anomalies– Using IP flow and SNMP data collected at UW border router

• 106 anomalies have been grouped into four categories

– IMAPIT developed to apply wavelet analysis to data– Deviation score developed to automate anomaly detection

• Results– Characteristics of anomalies exposed using different filters and data– Deviation score is effective detection method

• Future– Development of anomaly classification methods– Application of results in (distributed) detection systems