Deep Packet Inspection (DPI) Engineering for Enhanced Performance of Network Elements and Security Systems
PIs: Dr. Anat Bremler-Barr (IDC), Dr. David Hay (HUJI)
www.deepness-lab.org
• Deepness Lab was founded in November 2010
• Our mission: Deep Packet Inspection (DPI) for Next Generation Network devices
• Funding:
– 5 years ERC Starting Grant (1M Euro)
– 3 years Kabarnit, a Magnet program ($70K/year)
– A gift from Cisco ($75K)
• Main Industry Collaborations: Commtouch, Radware, Verint
People
Faculty: Anat Bremler-Barr (IDC Herzliya), David Hay (The Hebrew University of Jerusalem)
Postdocs: Shimrit Tzur-David, Yaron Koral
Ph.D. Students: Liron Schiff (Tel Aviv University), Yotam Harchol (The Hebrew University of Jerusalem)
Collaborators: Yehuda Afek (Tel Aviv University), Isaac Keslassy (Technion), Shir Landau-Feibish (Tel Aviv University)
Past Students: Victor Zigdon, M.Sc. (IDC Herzliya), Adam Mor, M.Sc. (IDC Herzliya)
People
Dr. Anat Bremler-Barr - Ph.D. with distinction, Tel-Aviv University, Israel (2001). Founder and chief scientist of Riverhead Networks (focused on distributed denial-of-service solutions; acquired by Cisco). Senior lecturer (assistant professor) with tenure at IDC.
Dr. David Hay - Ph.D. from the Technion (2007). Post-doc at Columbia University, NY, USA and Politecnico di Torino. Previously, also at IBM Research and Cisco San Jose. Senior lecturer (assistant professor) at the Hebrew U.
Deep Packet Inspection (DPI)
• DPI - Identifying signatures (patterns or regular expressions) in the packets’ payload
• DPI is the main action taken to inspect traffic and is therefore a critical component in next generation networks: security, content filtering, traffic monitoring, load balancing, lawful interception, targeted advertising, data leakage prevention, application-aware routing, and more.
• High-speed DPI is challenging and quickly becomes the bottleneck of the entire packet inspection process.
Impact
• 66% of network equipment vendors define DPI as “a must have” technology today [Heavy Reading Survey, 2011]
• DPI market in 2011 estimated at $550 million, with growth of 20%/year [Qosmos report, Heavy Reading, Dec. 2012]
Major Challenges
• Scalability:
– Rate - greater than 10 or even 100 Gbps
– Memory - handling thousands of signatures
– Power - reducing the high power consumption
• Compressed traffic
• Security of the NIDS itself:
– Current solutions are vulnerable to Denial of Service attacks
• DPI in Software Defined Networks
• Signature Extraction
Compressed HTTP
• 84.1% of the top 1,000 sites compress their traffic.
• Data compression is done by adding references to repeated data.
• There are two types of compression:
– Intra-response compression – the references point to bytes within the response (Gzip/Deflate)
– Inter-responses/connections compression – the references point to bytes in a separate file, called a dictionary (Google’s SDCH).
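The two compression types can be illustrated with Python's standard zlib module. This is only a sketch: zlib's preset dictionary (the `zdict` argument) stands in for an inter-response dictionary and is not the SDCH format, and the sample response and dictionary bytes are made up for illustration.

```python
import zlib

# Hypothetical HTTP response body with heavy internal repetition.
body = b"<div class='item'>hello</div>" * 50

# Intra-response compression (Deflate): references point back into the same response.
intra = zlib.compress(body)

# Inter-response compression, approximated with a zlib preset dictionary:
# references may also point into bytes that both sides already hold.
# (SDCH uses a different encoding; zdict is only an analogy.)
dictionary = b"<html><head><title></title></head><body><div class='item'></div></body></html>"
response = b"<html><head><title>Hi</title></head><body><div class='item'>x</div></body></html>"

def compress_with_dict(data, zdict):
    comp = zlib.compressobj(zdict=zdict)
    return comp.compress(data) + comp.flush()

def decompress_with_dict(blob, zdict):
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()

plain = zlib.compress(response)                    # no shared dictionary
shared = compress_with_dict(response, dictionary)  # with shared dictionary

print("repetitive body:", len(body), "->", len(intra), "bytes")
print("short response:", len(response), "->", len(plain), "bytes, or", len(shared), "with dictionary")
```

In both cases a DPI engine that sees only the compressed bytes sees references rather than the payload itself, which is why the traffic has to be decompressed (or the references otherwise resolved) before signatures can be matched.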
19% increase in 8 months!
Challenges
Current security tools do not deal with compressed traffic due to the great challenges in time and space
Compressed Traffic : Space Challenge
• Thousands of concurrent sessions
• Memory: 32KB/session
[Figure: compressed traffic → unzip → uncompressed traffic → DPI]
Contribution: improve space by 80% and time by 40%
Compressed Traffic : Time Challenge
• General belief: Decompression + pattern matching >> pattern matching
• Our algorithms show how to accelerate the pattern matching using the compression information: Decompression + pattern matching < pattern matching
High-Level Idea
• Compression is done by compressing repeated sequences of bytes
• Store information about the pattern matching results
– No need to fully perform pattern matching again on repeated sequences that were already scanned → x2-3 time reduction
• The buffers needed for decompression are not used most of the time, and can therefore be kept in compressed form most of the time → x5 space reduction
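To get a feel for the space numbers, here is a back-of-the-envelope calculation. The 32KB/session figure and the x5 reduction come from the slides; the session count of 10,000 is an assumed value standing in for "thousands of concurrent sessions".

```python
def decompression_memory_bytes(sessions, per_session_bytes=32 * 1024, reduction=1.0):
    """Total memory held for decompression buffers across all sessions."""
    return sessions * per_session_bytes / reduction

SESSIONS = 10_000  # assumption, not a figure from the slides

baseline = decompression_memory_bytes(SESSIONS)
reduced = decompression_memory_bytes(SESSIONS, reduction=5.0)

print(f"baseline: {baseline / 2**20:.1f} MiB")
print(f"with x5 reduction: {reduced / 2**20:.1f} MiB")
```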
The Other Side of the Coin: Acceleration by Identifying repetitions in uncompressed Traffic
There are repetitions in uncompressed HTTP traffic
– Entire files (e.g., images)
– Parts of the files (e.g., HTML tags, JavaScript)
We keep scanning the same data again and again (and get the same scanning results).
1. Identify frequently repeated data
– Stored in a dictionary
2. Perform DPI on the data once and remember the results
– DPI by pattern matching with the Aho-Corasick algorithm. The result is the state.
3. When encountering a repetition, recover the state without re-scanning
Delicate points need to be taken care of, so we won’t miss any pattern
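The three steps can be sketched as follows. This is a minimal illustration and not the lab's actual algorithm: the class and function names are invented here, and the rule used for the "delicate point" is one simple choice. A repeated chunk is scanned live until the automaton's depth is no larger than the number of chunk bytes consumed; from that point the state no longer depends on the bytes before the chunk, so the stored end state and the stored matches can be reused.

```python
from collections import deque

class AhoCorasick:
    def __init__(self, patterns):
        self.goto, self.fail, self.depth, self.out = [{}], [0], [0], [[]]
        for p in patterns:                      # build the trie
            s = 0
            for ch in p:
                if ch not in self.goto[s]:
                    self.goto.append({}); self.fail.append(0)
                    self.depth.append(self.depth[s] + 1); self.out.append([])
                    self.goto[s][ch] = len(self.goto) - 1
                s = self.goto[s][ch]
            self.out[s].append(p)
        queue = deque(self.goto[0].values())    # failure links, breadth first
        while queue:
            s = queue.popleft()
            for ch, t in self.goto[s].items():
                f = self.fail[s]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[t] = self.goto[f].get(ch, 0)
                self.out[t] = self.out[t] + self.out[self.fail[t]]
                queue.append(t)

    def step(self, s, ch):
        while s and ch not in self.goto[s]:
            s = self.fail[s]
        return self.goto[s].get(ch, 0)

def scan(ac, data, state=0):
    """Plain scan; returns (end state, [(end offset, pattern), ...])."""
    matches = []
    for i, ch in enumerate(data):
        state = ac.step(state, ch)
        matches += [(i + 1, p) for p in ac.out[state]]
    return state, matches

class ChunkCache:
    """Remembers the scan result of each dictionary chunk (scanned from the root)."""
    def __init__(self, ac):
        self.ac, self.cache = ac, {}

    def scan_chunk(self, chunk, state):
        if chunk not in self.cache:
            self.cache[chunk] = scan(self.ac, chunk)
        end_state, stored = self.cache[chunk]
        matches, j = [], 0
        # Scan live while the state may still depend on bytes before the chunk.
        while j < len(chunk) and self.ac.depth[state] > j:
            state = self.ac.step(state, chunk[j])
            j += 1
            matches += [(j, p) for p in self.ac.out[state]]
        if self.ac.depth[state] > j:            # whole chunk scanned, never synchronized
            return state, matches, j
        # Synchronized: reuse the stored matches that end after byte j.
        matches += [m for m in stored if m[0] > j]
        return end_state, matches, j            # j = bytes actually scanned

ac = AhoCorasick([b"attack", b"tack", b"ab", b"bab"])   # hypothetical signatures
cache = ChunkCache(ac)
state, _ = scan(ac, b"xxat")                            # traffic seen before the chunk
state, found, scanned = cache.scan_chunk(b"tack on abab", state)
print(found, "scanned", scanned, "of 12 bytes")
```

Offsets in the returned matches are relative to the start of the chunk. The live prefix scan is what prevents missing a pattern such as "attack" that starts before the repeated chunk and ends inside it.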
Complexity DoS Attack Over NIDS
• Easy to craft – very hard to process packets
• 2-step attack:
1. Kill the IPS/FW
2. Sneak into the network
[Figure: attacker reaching the network over the Internet]
Attack on Security Elements
Combined attack: DDoS on a security element exposed the network – theft of customers’ information
System Architecture
[Figure: processor chip with cores #1 to #10, each with its own queue (Q), fed by the NIC; a component detects heavy packets.]
Routine Mode: Load balance between cores
System Architecture
[Figure: the same processor chip, with cores #9 and #10 marked as dedicated cores; queues are labeled Q and B.]
Alert Mode: Dedicated cores for heavy packets. Others detect heavy packets and move them to the dedicated cores.
Cloud solution
• The different cores are different (virtual) machines.
• Load balancing sends heavy packets to machines that run a special more efficient processing method.
• In SDN, this can be done even faster and easier.
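The routine/alert dispatch policy can be sketched as below. Everything here is illustrative: the `Dispatcher` class, the `is_heavy` predicate, and the packet representation are assumptions, and how "heavy" is detected or when alert mode is entered is not specified in the slides. The sketch also collapses detection and dispatch into one function, whereas in the architecture above the regular cores detect heavy packets and hand them over.

```python
class Dispatcher:
    def __init__(self, is_heavy, n_cores=10, dedicated=(9, 10)):
        self.is_heavy = is_heavy                       # caller-supplied heaviness test
        self.all_cores = list(range(1, n_cores + 1))
        self.dedicated = list(dedicated)
        self.regular = [c for c in self.all_cores if c not in dedicated]
        self.alert = False                             # routine mode by default
        self._next = {"all": 0, "regular": 0, "dedicated": 0}

    def _round_robin(self, name, cores):
        core = cores[self._next[name] % len(cores)]
        self._next[name] += 1
        return core

    def dispatch(self, packet):
        """Return the core (or machine) number that should process the packet."""
        if not self.alert:
            # Routine mode: load balance between all cores.
            return self._round_robin("all", self.all_cores)
        if self.is_heavy(packet):
            # Alert mode: heavy packets go to the dedicated cores only.
            return self._round_robin("dedicated", self.dedicated)
        return self._round_robin("regular", self.regular)

# Stand-in heaviness test on a made-up per-packet cost estimate.
dispatcher = Dispatcher(is_heavy=lambda pkt: pkt["cost"] > 100)
print(dispatcher.dispatch({"cost": 5}))
dispatcher.alert = True
print(dispatcher.dispatch({"cost": 500}), dispatcher.dispatch({"cost": 5}))
```

In the cloud variant the same policy applies with virtual machines in place of cores, the dedicated machines running the more efficient processing method for heavy packets.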
TCAM – Ternary Content-Addressable Memory
[Figure: a TCAM holding ten ternary entries (each bit is 0, 1 or *), numbered 0 to 9, is searched in parallel with a search key. The match lines feed an encoder that outputs the index of the first matching entry (3 in the example); the index selects an action (deny, accept or log) from an SRAM.]
De-facto solution for packet classification.
Core component of an SDN switch.
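A software model of the lookup in the figure, as a minimal sketch: hardware compares all entries in parallel, while this loop walks them in priority order. The 4-bit entries and actions are made-up examples, not the entries from the slide.

```python
def ternary_match(entry, key):
    """True if every entry symbol is '*' or equals the corresponding key bit."""
    return len(entry) == len(key) and all(e in ("*", k) for e, k in zip(entry, key))

def tcam_lookup(entries, actions, key):
    """Return (index, action) of the first matching entry, or None on a miss."""
    for index, entry in enumerate(entries):
        if ternary_match(entry, key):
            return index, actions[index]   # the encoder picks the first match line
    return None

# Hypothetical table: entry index doubles as the SRAM address of the action.
entries = ["1110", "10**", "0***", "****"]
actions = ["deny", "accept", "log", "deny"]

print(tcam_lookup(entries, actions, "1011"))
print(tcam_lookup(entries, actions, "0110"))
```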
Some Challenges In Using TCAM
• Reducing the number of entries → power consumption reduction
• Dealing with ranges (how to encode the range [1-6]?)
• How to correct errors?
– More about it in the next slide
• How to use it for non-traditional tasks
– Traditionally, TCAM is used for IP lookup and header classification (e.g., using 5-tuples)
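For the range question, one standard answer is prefix expansion: split the range into aligned power-of-two blocks, each of which is a single ternary prefix. The sketch below assumes a 3-bit field for the [1-6] example; the function name is invented here, and this is the textbook baseline rather than any particular scheme from the lab.

```python
def range_to_prefixes(lo, hi, width):
    """Cover the integer range [lo, hi] with ternary prefixes of the given bit width."""
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block that is aligned at lo ...
        size = lo & -lo if lo else 1 << width
        # ... and still fits inside the remaining range.
        while size > hi - lo + 1:
            size >>= 1
        wild = size.bit_length() - 1                  # number of trailing '*' bits
        fixed = format(lo >> wild, "b").zfill(width - wild) if wild < width else ""
        prefixes.append(fixed + "*" * wild)
        lo += size
    return prefixes

print(range_to_prefixes(1, 6, 3))
```

For [1-6] this yields four entries (001, 01*, 10*, 110), which shows why ranges inflate the number of TCAM entries.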
Example: Error Correction in TCAM
• In SRAM (or any regular memory)
– Input: address (entry number)
– Output: content of that address
– One can apply an error detection/correcting code on that content
• In TCAM
– Even if the content seems OK, we still have false miss or indirect false miss errors; TCAM EDC/ECC are harder
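A small illustration of why a TCAM error is harder to notice than an SRAM error: the lookup returns an index (or a miss) rather than the stored word, so a corrupted entry silently changes the result. The tables and keys below are made up, and the sketch only shows that results change; it does not model the PEDS scheme.

```python
def lookup(entries, key):
    """Index of the first ternary entry matching the key, or None on a miss."""
    for index, entry in enumerate(entries):
        if all(e in ("*", k) for e, k in zip(entry, key)):
            return index
    return None

# Case 1: a '*' is corrupted into '0'; a key that used to hit now misses.
intact_a = ["10*", "0**"]
corrupt_a = ["100", "0**"]
print(lookup(intact_a, "101"), lookup(corrupt_a, "101"))

# Case 2: a '1' is corrupted into '*'; a higher-priority entry now wins
# over the entry that should have matched.
intact_b = ["11*", "1**"]
corrupt_b = ["1**", "1**"]
print(lookup(intact_b, "101"), lookup(corrupt_b, "101"))
```

In both cases every stored word is still a valid ternary word, so inspecting the returned result alone gives no hint that anything went wrong.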
PEDS: Parallel Error Detection Scheme for TCAM Devices
• Detecting all errors using the built-in parallel lookup of the TCAM
• The number of lookups is a function of the width of the TCAM word, and not the number of entries in the database.
– Which is 3 orders of magnitude larger
• Developed and patented in the DEEPNESS lab
CompactDFA for DPI
• Using TCAM to represent a huge DFA in a compact manner.
• Reducing the problem of pattern matching to IP lookup (much easier problem)
• Each byte scanned requires one TCAM lookup
– Can be reduced using variable stride traversal
– Further performance boost with parallelism and pipelining
#    Current State   Sym   Next State
1    0000 (s0)       A     0000 (s0)
2    0000 (s0)       B     0110 (s6)
3    0000 (s0)       C     1100 (s12)
4    0000 (s0)       D     0000 (s0)
5    0000 (s0)       E     0001 (s1)
6    0000 (s0)       F     0000 (s0)
7    0001 (s1)       A     0000 (s0)
8    0001 (s1)       B     0010 (s2)
9    0001 (s1)       C     0000 (s0)
10   0001 (s1)       D     0000 (s0)
11   0001 (s1)       E     0000 (s0)
12   0001 (s1)       F     0000 (s0)
13   0010 (s2)       A     0000 (s0)
14   0010 (s2)       B     0100 (s4)
15   0010 (s2)       C     0011 (s3)
16   0010 (s2)       D     0000 (s0)
...
84   1101 (s13)      F     0000 (s0)
          DFA      CompactDFA
Snort:    73MB     0.6MB
ClamAV:   1.5GB    26MB
[Figure labels: TCAM, SRAM, Longest Prefix Match]
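The reduction from pattern matching to longest prefix match can be sketched in a simplified form. The assumptions are stated plainly: each state is identified by its label (the pattern prefix it represents), and the reversed label serves as the state's code, whereas CompactDFA itself uses a compact bit encoding suited to TCAM. The patterns are hypothetical, over the table's A to F alphabet. The point shown is that one rule per state suffices when the most specific (longest) matching rule wins.

```python
def build_rules(patterns):
    """One rule per non-root state: (symbol, required code prefix) -> next state label."""
    labels = {p[:i] for p in patterns for i in range(1, len(p) + 1)}
    # State 'u + x' is entered on symbol x from any state whose label ends with u,
    # i.e. whose reversed label starts with reversed(u).
    return {(label[-1], label[:-1][::-1]): label for label in labels}

def next_state(rules, state, symbol):
    code = state[::-1]
    for n in range(len(code), -1, -1):       # longest prefix first
        hit = rules.get((symbol, code[:n]))
        if hit is not None:
            return hit
    return ""                                # default rule: back to the root

def match(patterns, text):
    rules = build_rules(patterns)
    state, found = "", []
    for i, symbol in enumerate(text):
        state = next_state(rules, state, symbol)
        # Every pattern that is a suffix of the state label ends at this position.
        found += [(i + 1, p) for p in patterns if state.endswith(p)]
    return found

patterns = ["EBC", "BCD", "CF"]              # hypothetical signatures
print(len(build_rules(patterns)), "rules")
print(match(patterns, "AEBCDFCFEBCF"))
```

A full DFA for the same patterns needs one entry per (state, symbol) pair, as in the table above, while this rule set has one entry per state. In hardware the longest-prefix selection is what the TCAM provides.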
Current DDoS Attack
• Armies of zombies: many sources
• Hard to identify behaviorally
• No known signatures
[Figure: zombies on innocent computers launching server-level, infrastructure-level and bandwidth-level DDoS attacks]
Automated Extraction of Signatures for Zero-day Internet Attacks
• Input:
– sample of attack traffic (high volume attack)
– sample of normal traffic
• Output: Automatically find signatures that appear frequently only during attack
• Where:
– Input collection:
  • In mitigation apparatus (DDoS Guard/firewall/anti-DDoS etc.)
  • In the cloud – collect data from several collectors.
– DDoS – power computation saving
– Signatures used by anti-DDoS devices and firewalls to stop attack
  • Mitigation in minutes, good enough for these types of attacks
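The input/output contract above can be sketched with exact counting of fixed-length substrings. This is a naive baseline for illustration only: the function names, the substring length `k`, and the two thresholds are assumptions, and a system working at line rate would need approximate frequency counting rather than the exact counters used here.

```python
from collections import Counter

def kgram_packet_counts(packets, k):
    """For each k-byte substring, count how many packets contain it."""
    counts = Counter()
    for payload in packets:
        counts.update({payload[i:i + k] for i in range(len(payload) - k + 1)})
    return counts

def extract_signatures(attack, normal, k=8, min_attack=0.5, max_normal=0.01):
    """Substrings frequent in the attack sample and rare in the normal sample."""
    in_attack = kgram_packet_counts(attack, k)
    in_normal = kgram_packet_counts(normal, k)
    return sorted(
        gram for gram, count in in_attack.items()
        if count / len(attack) >= min_attack
        and in_normal[gram] / max(len(normal), 1) <= max_normal
    )

# Made-up samples: every attack packet carries the same marker string.
attack_sample = [b"GET /a EVILBOT1 x", b"GET /b EVILBOT1 y", b"GET /c EVILBOT1 z"]
normal_sample = [b"GET /a hello", b"GET /b world"]

print(extract_signatures(attack_sample, normal_sample))
```

Comparing against the normal sample is what keeps common strings such as the request method out of the signature set, so that deploying the signatures on a firewall does not block legitimate traffic.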