Deep Packet Inspection Using GPUs

Qian Gong, Wenji Wu, Phil DeMar

GPU Technology Conference 2017

May 2017

Background

• Main uses for network traffic analysis
  – Operations & management
  – Capacity planning
  – Performance troubleshooting
• Levels of network traffic analysis
  – Device counter level (SNMP data)
  – Traffic flow level (flow data)
  – Packet level (the focus of this work)
    • Network security
    • Application performance analysis
    • Traffic characterization studies

5/11/2017 Deep Packet Inspection Using GPUs, GTC’17

Background (cont.)

Characteristics of packet-based network traffic analysis applications

• Time constraints on packet processing
• Computing and I/O throughput-intensive
• High levels of data parallelism
  • Packet parallelism. Each packet can be processed independently
  • Flow parallelism. Each flow can be processed independently
• Extremely poor temporal locality for data
  • Typically, data processed once in sequence; rarely reused

The Challenges

Packet-based traffic analysis tools face performance & scalability challenges within high-performance networks.

– High-performance networks:
  • 40GE/100GE link technologies
  • Servers are 10GE-connected by default
  • 400GE backbone links & 100GE host connections loom on the horizon
– Millions of packets generated & transmitted per sec
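
To put "millions of packets per sec" in numbers, here is a small worked calculation of the worst-case packet rate on an Ethernet link. It assumes minimum-size 64-byte frames plus the standard 20 bytes of preamble, start-of-frame delimiter, and inter-frame gap; the function name is illustrative and not from the talk.

```python
# Worst-case Ethernet packet rate: minimum-size frames sent back to back.
MIN_FRAME_BYTES = 64      # minimum Ethernet frame, including the FCS
WIRE_OVERHEAD_BYTES = 20  # 7-byte preamble + 1-byte SFD + 12-byte inter-frame gap


def max_packets_per_second(link_gbps, frame_bytes=MIN_FRAME_BYTES):
    """Upper bound on packets per second for a link of `link_gbps` Gb/s."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire


for gbps in (10, 40, 100, 400):
    print(f"{gbps:>3}GE: {max_packets_per_second(gbps) / 1e6:.2f} Mpps")
```

With these assumptions a 10GE link peaks at about 14.88 Mpps and a 100GE link at about 148.81 Mpps, which leaves a per-packet processing budget of only a few nanoseconds at the top end.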

Packet-based Traffic Analysis Tool Platform (I)

• Requirements on computing platform for high performance network traffic analysis applications
  – High compute power
  – Ample memory & IO bandwidth
  – Capability of handling data parallelism inherent with network data
  – Easy programmability

Packet-based Traffic Analysis Tool Platform (II)

• Three types of computing platforms:
  – NPU/ASIC, CPU, GPU

Features                       NPU/ASIC  CPU  GPU
High compute power             Varies    ✖    ✔
High memory bandwidth          Varies    ✖    ✔
Easy programmability           ✖         ✔    ✔
Data-parallel execution model  ✖         ✔    ✔

Architecture Comparison

Device         Cores  Bandwidth  DP       SP       Power  Price
NVIDIA K80     4992   480 GB/s   2.91 TF  8.73 TF  300W   $4,349
Intel E7-8890  18     102 GB/s   0.72 TF  1.44 TF  165W   $7,174

NVIDIA K80 vs. Intel E7-8890
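
The comparison is easier to read as ratios. The short calculation below uses only the figures from the K80 vs. E7-8890 table; the dictionary keys and helper name are made up for this sketch.

```python
# Figures copied from the K80 vs. E7-8890 table on this slide.
specs = {
    "NVIDIA K80": {"bandwidth_gbs": 480, "dp_tflops": 2.91, "sp_tflops": 8.73,
                   "power_w": 300, "price_usd": 4349},
    "Intel E7-8890": {"bandwidth_gbs": 102, "dp_tflops": 0.72, "sp_tflops": 1.44,
                      "power_w": 165, "price_usd": 7174},
}


def ratio(metric):
    """GPU-to-CPU ratio for one column of the table."""
    return specs["NVIDIA K80"][metric] / specs["Intel E7-8890"][metric]


for metric in ("bandwidth_gbs", "dp_tflops", "sp_tflops"):
    print(f"{metric}: {ratio(metric):.2f}x")

# Single-precision GFLOPS per dollar for each device.
for name, s in specs.items():
    print(f"{name}: {s['sp_tflops'] * 1000 / s['price_usd']:.2f} SP GFLOPS per USD")
```

On these numbers the GPU has roughly 4.7 times the memory bandwidth and 6 times the single-precision throughput of the CPU, at a lower list price.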

Our Solution

Network Traffic Analysis using GPUs

Highlights of our work:
• Demonstrated that GPUs can significantly accelerate network traffic analysis
• Designed and implemented a generic I/O architecture to capture network traffic and move it from the wire into the GPU domain
• Implemented a GPU-accelerated library for network traffic analysis

GPU-based Network Traffic Analysis Framework

Running modes
• Online analysis
  – Traffic capture
• Offline analysis

GPU-based analysis
• Header analysis
• Payload analysis

Applications
• Network Monitoring
• IPS/IDS
• Traffic Engineering
• And more

[Figure: framework diagram in three columns.
 Network Traffic Source: Network, WireCAP Packet Capture Engine, online traffic; Storage, Libpcap library, offline traffic.
 GPU Domain: Filter (BPF), Header Parser, Flow Table (SrcIP, DstIP, SrcPort, DstPort, Proto), Header Parsing and/or Assembly (suspicious packets), Pattern Matching; Header Analysis, Payload Analysis; Traffic Summarization, Abnormal Warning.
 Applications: Network Monitoring, IPS/IDS, Traffic Engineering.
 Configuration: Analyser Configuration, System Configuration (in standard JSON format).]
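
The flow table in the diagram is keyed by the 5-tuple (SrcIP, DstIP, SrcPort, DstPort, Proto). As a minimal CPU-side sketch of that data structure (not the authors' GPU implementation), the code below groups packets by 5-tuple and keeps per-flow packet and byte counts. The `FlowKey` type, the `summarize` function, and the sample addresses are hypothetical.

```python
from collections import defaultdict, namedtuple

# Hypothetical key type mirroring the 5-tuple named in the diagram.
FlowKey = namedtuple("FlowKey", "src_ip dst_ip src_port dst_port proto")


def summarize(packets):
    """Group (FlowKey, length_in_bytes) pairs into per-flow counters."""
    table = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for key, length in packets:
        entry = table[key]
        entry["packets"] += 1
        entry["bytes"] += length
    return dict(table)


web = FlowKey("10.0.0.1", "10.0.0.2", 40000, 80, "TCP")
dns = FlowKey("10.0.0.1", "10.0.0.3", 40001, 53, "UDP")
flows = summarize([(web, 1500), (dns, 80), (web, 400)])
print(flows[web])
```

On a GPU the same grouping would typically be done with a parallel hash table or a sort by key rather than a sequential loop.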

System Architecture – Online Analysis

[Figure: online analysis pipeline with stages 1. Traffic Capture, 2. Preprocessing, 3. GPU-based Analysis, 4. Output. Labels: NICs, Packets, Capturing, Packet Buffer, Captured Data, Packet Chunks, User Space, GPU Domain, Traffic Analysis Kernels, Output.]

Four types of logical entities:
• Traffic Capture
• Preprocessing
• GPU-based Analysis
• Output (in JSON format)

WireCAP Packet Capture Engine

• An advanced packet capture engine for commodity network interface cards (NICs) in high-speed networks
  – Lossless zero-copy packet capture and delivery
  – Zero-copy packet forwarding
  – A Libpcap-compatible interface for low-level network access
• WireCAP project website
  – http://wirecap.fnal.gov (Note: source code is available)

GPU-based Network Traffic Analysis

• A GPU-accelerated library for network traffic analysis
  – Dozens of CUDA kernels
  – Can be combined in a variety of ways to perform intended analysis operations
• Two types of GPU-based network traffic analysis
  – Header analysis (see our GTC’13 talk)
    • http://on-demand.gputechconf.com/gtc/2013/presentations/S3146-Network-Traffic-Monitoring-Analysis-GPUs.pdf
  – Packet payload analysis
    • Deep packet analysis (TCP streams)

Challenges in Stream Reassembly (I)
--- Parallelism

Why stream reassembly?
• Payloads of packets belonging to the same TCP stream need to be assembled before matching against predefined patterns

However…
• Stream reassembly via a parallel hash table requires an atomic lock on each hash key (TCP 4-tuple)
• Data parallelism is limited when few simultaneous TCP connections are present

[Figure: packets carrying scattered payload characters pass through reordering & normalization, then matching.]
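
A tiny example shows why reassembly has to come before matching. The signature and segment contents are invented for illustration, and sequence-number wraparound is ignored.

```python
SIGNATURE = b"ATTACK"  # hypothetical signature, not taken from a real rule set

# (sequence number, payload) pairs of one TCP stream, received out of order.
segments = [(1003, b"ACK"), (1000, b"ATT")]

# Matching each packet on its own misses the signature.
per_packet_hit = any(SIGNATURE in payload for _, payload in segments)

# Ordering by sequence number and concatenating the payloads exposes it.
stream = b"".join(payload for _, payload in sorted(segments))
reassembled_hit = SIGNATURE in stream

print(per_packet_hit, reassembled_hit)
```

Neither segment contains the signature alone, so only the reassembled stream matches.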

Challenges in Stream Reassembly (II)
--- Denial of Service Attack

• To address the problem of out-of-order packets, one widely adopted approach is packet buffering and stream reassembly, i.e., buffer all packets following a missing one, until they become in-sequence again.
• This approach is intuitive but vulnerable to denial-of-service (DoS) attacks, whereby attackers exhaust the packet buffer capacity by sending long segments of out-of-order packets.

[Figure: a TCP stream with a missing segment. Legend: already received and forwarded data; buffered data; new data.]
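
The buffering weakness can be reproduced with a few lines. This is a deliberately naive reassembler written for illustration (the class and method names are hypothetical), with no memory cap, no timeout, and no sequence-number wraparound handling.

```python
class BufferingReassembler:
    """Naive in-order delivery: hold early segments until the gap before them is filled."""

    def __init__(self, first_seq):
        self.next_seq = first_seq  # next byte offset expected in sequence
        self.pending = {}          # seq -> payload for segments that arrived early

    def receive(self, seq, payload):
        """Accept one segment and return whatever can now be delivered in order."""
        if seq < self.next_seq:
            return b""             # already delivered; ignored in this sketch
        self.pending[seq] = payload
        delivered = b""
        while self.next_seq in self.pending:
            chunk = self.pending.pop(self.next_seq)
            delivered += chunk
            self.next_seq += len(chunk)
        return delivered

    def buffered_bytes(self):
        return sum(len(p) for p in self.pending.values())


# The attacker withholds the first 100-byte segment and sends 1,000 later ones.
victim = BufferingReassembler(first_seq=0)
for i in range(1, 1001):
    victim.receive(i * 100, b"x" * 100)
held = victim.buffered_bytes()
print(held)
```

Every segment after the gap stays in memory until the missing one arrives, so a single connection can pin an arbitrary amount of buffer space.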

GPU-based Deep Packet Analysis Pipeline

[Figure: pipeline stages: packets; Packet Statistic Analysis / Flow Classification; Per-flow TCP Data Reassembly; Automaton-based Pattern Matching. Other labels: Stream Processing, TCB Connection Table, hash(4-tuple), hk, corresponding connection records, to next hash bucket, Connection, State, Next.]

Hybrid Pattern Matching Pipeline
• Intra-batch TCP packets reordering & assembly
• Inter-batch split detection

Pattern matching without buffering or dropping out-of-order packets

Key Mechanisms (I)

Observation 1
According to a previous Internet traffic analysis report, only 2%-5% of packets are affected by reordering.
When processing packets in batches (~1e6 packets), 0.1%-0.5% of TCP streams are spread across batches.

Mechanism 1 --- intra-batch stream reassembly
+ Load packets from the network to GPUs in batches
+ In-batch packet reordering and reassembly via parallel sorting

GPU-based TCP Stream Reassembly

[Figure: two stages. Packet Reordering: raw packets (p1…p9) are sorted by (4-tuple | seq #) into flow and sequence order; a flow identifier array and a next packet array are built with filter + scan. Stream Normalization: a scan over seq # gives the bytes of overlapping data for each packet (prefer new data).]
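
The sort-then-normalize idea can be sketched sequentially as follows. This is a CPU stand-in for the parallel sort and scan on the slide, not the authors' kernels. It assumes that "prefer new data" means a later segment overwrites overlapping bytes, stops at the first hole in a flow, and ignores sequence-number wraparound; the function name and sample flows are hypothetical.

```python
from itertools import groupby


def reassemble_batch(packets):
    """packets: iterable of (flow_key, seq, payload). Returns {flow_key: bytes}."""
    # One sort by (flow, sequence number) replaces per-flow locking.
    ordered = sorted(packets, key=lambda p: (p[0], p[1]))
    streams = {}
    for flow, group in groupby(ordered, key=lambda p: p[0]):
        group = list(group)
        base = group[0][1]  # lowest sequence number seen for this flow
        stream = bytearray()
        for _, seq, payload in group:
            offset = seq - base
            if offset > len(stream):
                break  # hole in the byte stream; keep the contiguous prefix only
            # Overlapping bytes are overwritten by the later segment.
            stream[offset:offset + len(payload)] = payload
        streams[flow] = bytes(stream)
    return streams


flow_a = ("10.0.0.1", 1234, "10.0.0.2", 80)
flow_b = ("10.0.0.3", 4321, "10.0.0.2", 80)
batch = [
    (flow_a, 1003, b"ACK"),
    (flow_b, 500, b"xyz"),
    (flow_a, 1000, b"ATT"),
    (flow_a, 1002, b"TAC"),  # overlaps both neighbouring segments
]
result = reassemble_batch(batch)
print(result[flow_a])
```

Because the whole batch is sorted once, flows with many packets and flows with few are handled by the same pass.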

Key Mechanisms (II)

Observation 2
If a string S is matched across a list of packets P1P2…PN, a suffix of P1 must match a prefix of S, a prefix of PN must match a suffix of S, and each of P2…PN-1 must match a prefix of a suffix of S.

Mechanism 2 --- inter-batch split detection
+ Combine the Aho-Corasick (AC) and suffix-AC automata to detect signatures spread over different batches
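
Observation 2 can be checked directly with a brute-force function. This only illustrates the condition for one signature and an ordered packet list; it is not the AC / suffix-AC automaton combination used in the talk, and the function name is hypothetical.

```python
def spans_packets(signature, packets):
    """Return True if `signature` starts in packets[0], covers every middle
    packet completely, and ends in packets[-1] (needs at least two packets)."""
    first, *middle, last = packets
    for k in range(1, len(signature)):  # k bytes matched at the end of P1
        if not first.endswith(signature[:k]):
            continue
        rest = signature[k:]            # suffix of S still to be matched
        for pkt in middle:
            if len(pkt) >= len(rest) or not rest.startswith(pkt):
                break                   # not a prefix of the remaining suffix
            rest = rest[len(pkt):]
        else:
            if last.startswith(rest):   # PN begins with the final suffix of S
                return True
    return False


print(spans_packets(b"hers", [b"xxh", b"er", b"syy"]))
```

The middle packets are consumed as prefixes of whatever suffix of S remains, which is exactly the role the suffix automaton plays when the packets arrive in different batches.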

GPU-based Pattern Matching for Out-of-order Packets

Intra-batch: AC automaton

Keywords: X = {he, his, she, hers}

• One thread per packet
• Each thread scans extra N bytes towards its consecutive packet

[Figure: left, state transition automaton for X; right, parallel execution mode with threads k, k+1, k+2.]
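
For reference, here is a compact Aho-Corasick sketch over the slide's keyword set, plus a per-packet scan that reads a few extra bytes into the next packet. The sequential loop stands in for one GPU thread per packet. Choosing N as the longest keyword length minus one is an assumption, as are all function names; patterns spanning more than two packets are not handled.

```python
from collections import deque


def build_ac(keywords):
    """Build goto, failure, and output tables for the keyword set."""
    goto, fail, out = [{}], [0], [set()]
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(word)
    queue = deque(goto[0].values())  # depth-1 states keep failure link 0
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out


def ac_search(text, automaton):
    """Return (start_index, keyword) for every match in text."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        hits.extend((i - len(word) + 1, word) for word in out[state])
    return hits


def scan_flow(packets, keywords):
    """Scan each packet plus N extra bytes of the next one (N is assumed)."""
    automaton = build_ac(keywords)
    extra = max(len(w) for w in keywords) - 1
    hits = []
    for k, payload in enumerate(packets):  # one thread per packet on the GPU
        tail = packets[k + 1][:extra] if k + 1 < len(packets) else ""
        for start, word in ac_search(payload + tail, automaton):
            if start < len(payload):       # count a match where it begins
                hits.append((k, start, word))
    return hits


X = ["he", "his", "she", "hers"]
print(sorted(ac_search("ushers", build_ac(X))))
print(sorted(scan_flow(["xxsh", "ers"], X)))
```

Attributing each match to the packet in which it starts prevents the overlap region from being reported twice.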

GPU-based Pattern Matching for Out-of-order Packets

Inter-batch: AC automaton & Suffix-AC automaton

Suffix Pattern Tree (PST)

Suffix set of X: {e, is, s, he, ers, rs}

/* Field types are assumed; the slide omits them. */
typedef struct {
    int nextState[256];     /* next state for each input byte */
    int preState;           /* state this one was reached from */
    unsigned char preChar;  /* byte on the edge from preState */
} PST;

suffix string = path(state)

[Figure: out-of-order packets, three cases. Case 1: AC state. Case 2: Suffix-AC state. Case 3: AC state and Suffix-AC state. Legend: new packets; received and forwarded data.]
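
The suffix set on this slide is the set of proper suffixes of the keywords in X = {he, his, she, hers}. A short check (the function name is hypothetical):

```python
def proper_suffixes(keywords):
    """All non-empty suffixes of each keyword, excluding the keyword itself."""
    return {word[i:] for word in keywords for i in range(1, len(word))}


X = ["he", "his", "she", "hers"]
suffix_set = proper_suffixes(X)
print(sorted(suffix_set))
```

"he" appears in the set because it is a proper suffix of "she", even though it is also a keyword in its own right.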

Performance Evaluation

Traffic Statistics
• Traffic source: real traffic mirrored from the Fermilab gateway
• Traffic pattern (average per batch)

# of packets         1 million
# of data packets    776,207
Mean packet length   1415 bytes
# of connections     15,500

Base Systems:
• Intel Xeon CPU E5-2650 @ 2.30 GHz, NVIDIA K40

Throughput (without memory transfer)
• TCP reassembly: 72.96 Mpps (192× speedup compared to libnids on CPU)
• TCP state management: 286.85 Mpps
• Pattern matching (AC & Suffix-AC): 5.83 Mpps
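
Two derived figures follow from the reported numbers. The Gb/s conversion assumes every matched packet has the mean length of 1415 bytes; the slide does not say whether that mean covers all packets or only data packets, so treat the result as a rough estimate. Variable names are illustrative.

```python
# Values reported on the slide.
reassembly_mpps = 72.96
speedup_vs_libnids = 192
pattern_matching_mpps = 5.83
mean_packet_bytes = 1415

# CPU baseline implied by the stated speedup.
implied_libnids_mpps = reassembly_mpps / speedup_vs_libnids

# Pattern-matching rate in Gb/s, assuming mean-length packets throughout.
pattern_matching_gbps = pattern_matching_mpps * 1e6 * mean_packet_bytes * 8 / 1e9

print(f"libnids baseline: {implied_libnids_mpps:.2f} Mpps")
print(f"pattern matching: {pattern_matching_gbps:.1f} Gb/s")
```

Under that assumption, pattern matching is the slowest stage at roughly 66 Gb/s of payload, while the implied libnids baseline is about 0.38 Mpps.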

Comparison to Existing Tools

• Comparison to Snort [1], Split-Detect [2], and GASPP [3]

Feature                                     Snort              Split-Detect     GASPP                          Ours
Computing platform                          CPU                CPU              GPU                            GPU
Methods                                     Stream reassembly  Split detection  Intra-batch stream reassembly  In-batch stream reassembly + inter-batch split detection
Detection over out-of-order (OOO) packets   ✔                  ✔                limited                        ✔
Resistance to fragmentation flood           N                  Y                N                              Y
Throughput                                  Low                Low              High                           High

[1] Roesch, Martin. "Snort: Lightweight intrusion detection for networks." LISA. Vol. 99. No. 1. 1999.
[2] Varghese, George, J. Andrew Fingerhut, and Flavio Bonomi. "Detecting evasion attacks at high speeds without reassembly." ACM SIGCOMM Computer Communication Review. Vol. 36. No. 4. ACM, 2006.
[3] Vasiliadis, Giorgos, et al. "Design and Implementation of a Stateful Network Packet Processing Framework for GPUs." IEEE/ACM Transactions on Networking (2016).

Functionality Evaluation in the Presence of Adversaries

• Robust stream reassembly in the face of out-of-order packets
• Immune to SYN flood and ‘cold start’ during normalization
• Protected against attacks on available buffer memory by timeout and connection evasion mechanisms

Future Work

• Extend the GPU-based deep packet analysis framework to work with regular expressions
• Complement the header information with the payload detection results for a thorough inspection/analysis
• Optimize and evolve the GPU-based network traffic analysis framework for 40GE/100GE networks