Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan...

31
Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1 , Vern Paxson 2 , Holg er Dreger 1 , Anja Feldmann 1 , Robin S ommer 1 1 TU München, 2 ICSI/LBNL Internet Measurement Conference (I MC) 2005

Transcript of Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan...

Page 1: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

Building a Time Machine for

Efficient Recording and Retrieval of High-Volume Network TrafficStefan Kornexl1, Vern Paxson2, Holger Dreger1,

Anja Feldmann1, Robin Sommer1

1TU München, 2ICSI/LBNL

Internet Measurement Conference (IMC) 2005

Page 2: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 2

Reference

Stenfan Kornel, Vern Paxson, Holger Dreger, Anja Feldmann, Robin Sommer, “Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic,” 5th ACM IMC 2005.

“High-Performance Packet Recording for Network Intrusion Detection,” master thesis by Stefan Kornexl, 2005.

Time Machine webpage http://www.net.t-labs.tu-berlin.de/research/tm/

Page 3: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 3

Outline

Motivation and Goals

Feasibility Study (trace-driven simulation)

System Architecture

Performance Evaluation

Conclusion and Comments

Page 4: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 4

Motivation

The availability of packet recording is considered a big benefit for network security monitoring Security forensics

Determining how an attacker compromised a given host Network trouble-shooting

Inspecting the precursors to a fault after the fault Event correlation

NIDS could analyze past events that are not considered “interesting” until more recently seen traffic hinted at their relevance

Page 5: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 5

Problems

Looking at raw packets (not only headers but full contents)

Storage constrains In many operational environments, it’s infeasible to capture the

entire traffic stream due to the enormous volume of the traffic

Problems for data filtering Hard to decide beforehand what context will turn out to be

relevant retrospectively to investigate incidents Filtering still becomes technically problematic in high speed

network (n TB per day)

Data retrieval is like finding needle in a haystack It’s time-consuming and cumbersome

Page 6: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 6

Related Work

Brute-force bulk-recording Only in low volume environments

Record those packets that trigger alerts Do not support retrospective analysis of a problematic

host’s earlier activity

Sampling – might loose important evidence Data abstraction – provide less information

Page 7: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 7

Objection

Design and implement a packet recording system “Time Machine” Use dynamic packet filtering and buffering to enable

effective recording of large traffic stream Nearly complete historic data for several days

Allowing to conveniently “travel back in time”

Application: E.g., a forensic tool – to extract detailed past information

about unusual activities once they are detected

Page 8: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 8

The Approach

Observation (key insight): “Heavy-tailed” distribution in network traffic

Most network connections are quite short

Only a small number of large connections accounting for the bulk of the total volume

Compromising is at the beginning of most attacks For forensics and trouble-shooting applications the

beginning of a large connection contains the most significant information

Page 9: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 9

The Approach (cont’d)

Exploit the “heavy-tailed” nature to partition the traffic stream into a small subset of high interest vs. a large remainder of low interest Then record the small subset and discard the rest

Cutoff limit, N: For every connection, it buffers up to the first N bytes

of traffic Greatly reduce the traffic we must buffer Retain full context for small connections and the

beginning for large connections

Page 10: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 10

Design Goals for the Time Machine Provide raw packet data

Buffer traffic comprehensively

Prioritize traffic

Automated resource management

Efficient and flexible retrieval

Suitable for high-volume environments using commodity hardware

Page 11: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 11

Outline

Motivation and Goals

Feasibility Study (trace-driven simulation)

System Architecture

Performance Evaluation

Conclusion and Comments

Page 12: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 12

Environments

MWN Munich Scientific Research Network in Munich, Germany About 50,000 hosts, 2 TB/day 15-20% FTP traffic 350 Mbps (68 Kpps) at busy-hour

LBNL Lawrence Berkeley National Laboratory in California, USA About 9,000 hosts & 4,00 users 320 Mbps (37 Kpps) at busy-hour

NERSC National Energy Research Scientific Computing Center About 600 hosts & 2,000 users (dominated by large transfers) 260 Mbps (43 Kpps) at busy hour

Page 13: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 13

Datasets

Connection-level summaries (1 week) collected by Bro NIDS MWN – 355 million connections (from 2004/10/18) LBNL – 22 million connections (from 2005/2/7) NERSC – 4 million connections (from 2005/4/29)

These logs capture the nature of their environments but with a relatively low volume compared to full packet-level data

Use packet-buffer model to simulate packet-level communication and evaluate the memory requirements of a Time Machine

Page 14: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 14

Heavy-tailed Distribution and the Cutoff

(log-log scaled)

Cutoff = 20 KB≈ 90% connections(record)

≈ 10% connections(discard)

12% 14%

15%

Their bytes:(discard)

NERSC 99.86%LNBL 96%MWN 87%

Page 15: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 15

Evaluate the Memory Requirements Eviction time, Te :

How long the buffer stores each connection’s data (The goal) aim for a value of Te on the order of days

rather than minutes

Changing the cutoff N and the eviction time Te to evaluate the efficiency (feasibility) of a Time Machine Results: using a cutoff of 10-20 KB, buffering several

days of traffic is practical

Page 16: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 16

Required Memory for LBNL

64 GB68 GB

Increase the duration of data availability by a factor of 32 (3h vs. 4d)

Stop to increase after 4 days, since the constrain of eviction time Te

5th day

Page 17: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 17

Required Memory for NERSC

344 GB

14.9 GB

NERSC has large proportion of high-volume traffic(14% connections -> 99.86% bytes)

• Without a cutoff, the volume is spiky

• Te only is helpless for volume because of the intermittent bursts of traffic

Page 18: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 18

Required Memory for MWNMWN has lower fraction of bytes in the larger connections

(15% connections -> 87% bytes)The gain from the cutoff is not quite as large, likely due tothe larger fraction of HTTP traffic

Page 19: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 19

Outline

Motivation and Goals

Feasibility Study (via trace-driven simulation)

System Architecture

Performance Evaluation

Conclusion and Comments

Page 20: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 20

Time Machine System Architecture 4 Main Functions :

1. buffering traffic using a cutoff

3. providing flexible retrieval of subsets of the packets

4. enabling customization

2. migrating the buffered packets to disk and managing the associated storage

Page 21: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 21

Two-thread Architecture

Separates user interaction from recording to ensure that packet capture has higher priority than packet retrieval

Page 22: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 22

Packet Capture

The capture unit Receive packets from network tap and passes them o

n to the classification unit

Use libpcap packet capture library to collect and store each packet’s full content and capture timestamp libpcap can specify a kernel-level BPF (BSD Packet Fil

ter) capture filter to discard “uninteresting” traffic as early as possible

Page 23: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 23

Classification

The classification unit Divide the incoming packet stream into user-defined

classes Assign packets to different storage containers based on their classes

Responsible for monitoring the cutoff with the help of the connection tracking unit Connection tracking unit keeps per connection statistics and checks if

the connection the packet belongs to has exceeded its cutoff threshold

An example of the “telnet” class:name BPF filter (rule)

priority cutoff

memory and disk buffer size

Page 24: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 24

Storage Containers

The architecture supports customization by splitting the overall storage into several storage containers Each storage container is responsible for storing a

subset of packets within the resources (memory/disk) According to the user defined classes

RAM and disk buffers are implemented as two ring buffers Packets evicted from the RAM buffer are migrated to the

disk buffer And eventually be deleted

Page 25: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 25

Indexing

For efficient retrieval Use an index across all packets stored in all storage containers Each index manages a list of time intervals for every unique key

value Update [Tstart, Tend] for each key (each incoming packets)

The time intervals provide information on whether packets with that key value are available in a given storage container and at what starting timestamp Just scan linearly through the intervals it gets from the index

Multiple indexes Support any number of indexes over an arbitrary set of protocol h

eader fields

Page 26: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 26

Query Processing

Provides a flexible language to express queries for subsets of the packets

Each query consists of a logical combination of time ranges, keys, and an optional BPF filter

1. Check index, get the time range of the query.

2. Locate the time ranges in the storage containers using binary search

3. Scanning all packets in the identified time ranges and checking if they match the query

4. Writing the results to a tcpdump trace file on disk

Page 27: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 27

User Interface

Allows the user to configure the recording parameters Classification rules, cutoff, storage management, inde

xing Δt, etc. Issues queries to the query processing unit to ret

rieve subsets of the recorded packets

Page 28: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 28

Outline

Motivation and Goals

Feasibility Study (via trace-driven simulation)

System Architecture

Performance Evaluation

Conclusion and Comments

Page 29: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 29

Evaluation in LBNL Configuration:

3 classes, each witha 20KB cutoff:

• TCP 90GB • UDP 30GB • Others 10GB

Retention:The distance back in time to which wecan travel at any particular moment

Increases after the Time Machine startsuntil the disk buffers have filled

Correlates with the incoming bandwidth for each class and its variations due to diurnal and weekly effects

Page 30: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 30

Evaluation

In LBNL 98% of the traffic gets discarded The remainder imposes an a average (maximum) rate of 300 KB/

s (2.6 MB/s) Over the 2 weeks of operation libpcap reported only 0.016% of all

packets dropped In MWN

85% of the traffic gets discarded Average (maximum) rate of 3.5 MB/s (13.9 MB/s)

larger volume of HTTP traffic Issues: need to more aggressively exploit the classification and c

utoff mechanisms to appropriately manage the large fraction of HTTP traffic

Page 31: Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic Stefan Kornexl 1, Vern Paxson 2, Holger Dreger 1, Anja Feldmann.

2007/12/21 Speaker: Li-Ming Chen 31

Conclusion

A concept of a Time Machine for efficient network packet recording and retrieval is proposed Relies on the “heavy-tailed” nature of network traffic Record most connections in their entirety and skip the bulk of the

total volume

Time Machine Can buffer several days of raw high-volume traffic using commod

ity hardware Provides an efficient query interface Automatically manages its available strorage

Using a trace-driven simulation and real experience to demonstrate the effectiveness