Protomatching Network Traffic for High Throughput Network Intrusion Detection
Shai Rubin (Microsoft Security Analysis Services)
Somesh Jha (University of Wisconsin, Comp. Sciences)
Barton P. Miller (University of Wisconsin, Comp. Sciences)
Presented by Zhaosheng Zhu
2
Signature evolution
• Informally, a signature is usually defined as “a characteristic pattern of the attack”.
Attacker, Network, NIDS
Signature database
3
Signature evolution
• Informally, a signature is usually defined as “a characteristic pattern of the attack”.
Attacker, Network, NIDS
GET <URL>/cmd.exe HTTP/1.1\n
• “cmd.exe” is the attack pattern
Signature database
cmd.exe
4
Signature evolution
• Informally, a signature is usually defined as “a characteristic pattern of the attack”.
Shai, Network, NIDS
• “cmd.exe” is the attack pattern
Signature database
cmd.exe
Be aware of the “cmd.exe” attack
5
Signature evolution
• Informally, a signature is usually defined as “a characteristic pattern of the attack”.
Attacker, Network, NIDS
GET <URL>/cmd.exe HTTP/1.1\n
• “cmd.exe” is the attack pattern,
• but only if it is part of a URL
Signature database
cmd.exe
6
Signature evolution
• Informally, a signature is usually defined as “a characteristic pattern of the attack”.
Attacker, Network, NIDS
• “cmd.exe” is the attack pattern,
• but only if it is part of a URL,
• and the HTTP method is GET
Signature database
cmd.exe
POST <URL>/cmd.exe HTTP/1.1\n
7
Signature evolution
• Informally, a signature is usually defined as “a characteristic pattern of the attack”.
Attacker, Network, NIDS
• “cmd.exe” is the attack pattern,
• but only if it is part of a URL,
• and the HTTP method is GET,
• and it takes into account upper- and lower-case characters,
Signature database
cmd.exe
GET <URL>/CMD.exe HTTP/1.1\n
8
Signature evolution
• Informally, a signature is usually defined as “a characteristic pattern of the attack”.
Attacker, Network, NIDS
• “cmd.exe” is the attack pattern,
• but only if it is part of a URL,
• and the HTTP method is GET,
• and it takes into account upper- and lower-case characters,
• and it takes into account HTTP encodings
Signature database
cmd.exe
GET <URL>/%43MD.exe HTTP/1.1\n
9
Problem in This Talk
[Figure: the traffic matched by a traditional signature (“cmd”, “cmd.exe”) compared with the actual attack, over TCP streams]
What we specify: a traditional signature that exposes:
• false negatives
• false positives
What we enforce: a signature that inherently fits the attack.
Goal: Develop a signature that is cheaper to enforce
10
Contributions
• Conceptual: Protomatching signature
• Practical: Superset Protomatcher
• Real world impact: 25% improvement in Snort performance
11
Protomatching Signature
• It is a regular expression with two properties:
– First, it ensures that the characteristic pattern of an attack appears in the context that is necessary for the attack to succeed.
– Second, it matches both normalized and encoded versions of an attack.
12
Superset protomatcher
• It recognizes a superset of the traffic matched by a full-coverage protomatcher.
• Three properties:
– A superset protomatcher consumes less memory.
– Traffic that matches the superset protomatcher may not match any NIDS signature.
– Traffic that does not match the superset protomatcher also does not match any signature in the NIDS database.
13
Related work
• Protocol analysis and traffic normalization
– Modern NIDS are based on the analyze-normalize-match (ANM) methodology.
– Ptacek and Newsham were the first to recognize that a NIDS that does not perform normalization is susceptible to evasion.
– The problem of alternate encodings is particularly painful for HTTP traffic.
14
Related Work II
• Fast pattern matching for NIDS
– Previous work does not solve the encodings problem and does not consider protocol analysis in the matching algorithm.
– Researchers have proposed using regular expression matching.
– To match regular expressions, Sommer and Paxson used a DFA. However, they performed matching on already-normalized traffic.
15
Related Work III
• Dealing with high-speed links
– To deal with high-speed links, researchers have suggested a distributed NIDS that balances the network traffic so that each sensor monitors a different portion of the protected network.
– Our work focuses on the performance of a single sensor. It can perform better within a cooperating distributed design.
16
Analyze-normalize-match (ANM) approach
• First, a NIDS encodes its signatures in a normalized form
• At runtime, the NIDS parses the traffic according to the protocol the attack uses and normalizes the traffic
• Last, the NIDS matches the normalized traffic against its normalized signatures.
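The three ANM phases above can be sketched as three separate functions. This is a minimal illustration, not how any real NIDS implements them: the signature list, the function names, and the restriction to the request line are assumptions, and `urllib.parse.unquote` only handles %XX escapes (not %uXXXX).

```python
from urllib.parse import unquote

# Hypothetical signature database, already stored in normalized (upper-case) form.
SIGNATURES = ["CMD.EXE"]

def analyze(request: str) -> dict:
    """Phase 1: protocol analysis of an HTTP request line."""
    method, url, version = request.strip().split(" ", 2)
    return {"method": method, "url": url, "version": version}

def normalize(url: str) -> str:
    """Phase 2: decode %XX escapes and fold case."""
    return unquote(url).upper()

def match(url: str) -> bool:
    """Phase 3: plain substring search against the normalized signatures."""
    return any(sig in url for sig in SIGNATURES)

def anm(request: str) -> bool:
    """Run all three phases; every request, benign or not, pays for all of them."""
    fields = analyze(request)
    return fields["method"] == "GET" and match(normalize(fields["url"]))
```

Each phase walks over the request again, which is the cost that protomatching aims to remove.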
17
Current conversion and signature matching
Input: GET <…>/%43MD.exe HTTP/1.1\n
1. Protocol analysis: Method = GET, URL = <…>/%43MD.exe, Version = HTTP/1.1
2. Normalization: URL = CMD.EXE
3. String matching against Sig = CMD.EXE: Yes → Malicious, No → Benign
• Naively, each phase requires traversing the input
• In practice (e.g., Snort) two traversals:
– Protocol analysis + normalization
– Matching
• Notice that all traffic, benign and malicious, requires all three phases
18
Protomatching
Current approach (three phases):
Input: GET <…>/%43MD.exe HTTP/1.1\n
1. Protocol analysis: Method = GET, URL = <…>/%43MD.exe, Version = HTTP/1.1
2. Normalization: URL = CMD.EXE
3. Pattern matching against Sig = CMD.EXE: Yes → Malicious, No → Benign
Protomatching (one phase):
Input: GET <…>/%43MD.exe HTTP/1.1\n
Match against Sig = ????: Yes → Malicious, No → Benign
• Goal: Single traversal of the input
• Protomatching = Protocol analysis + Normalization + Matching
19
Protomatching
Current approach (three phases):
Input: GET <…>/%43MD.exe HTTP/1.1\n
1. Protocol analysis: Method = GET, URL = <…>/%43MD.exe, Version = HTTP/1.1
2. Normalization: URL = CMD.EXE
3. Pattern matching against Sig = CMD.EXE: Yes → Malicious, No → Benign
Protomatching (one phase):
Input: GET <…>/%43MD.exe HTTP/1.1\n
Match against Sig = Regular expression: Yes → Malicious, No → Benign
Single pass implies: use a Deterministic Finite State Machine
20
Converting a traditional signature into a protomatching signature
1. Let S be a traditional signature
2. Expand S to conform to the protocol specification
21
Traditional signature
• *[c|C][m|M][d|D].[e|E][x|X][e|E]
• 8 states
• size = 8*256 = 2048 bytes
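The state and size figures can be reproduced with a small dense-table DFA. The sketch below is an illustration under stated assumptions: the “.” is treated as a literal dot, each transition entry occupies one byte, and the function names `build_dfa` and `run` are hypothetical.

```python
def build_dfa(pattern: bytes) -> list:
    """Build a dense DFA (one 256-entry row per state) that accepts any
    input containing `pattern`, ignoring ASCII case."""
    pat = pattern.lower()
    n = len(pat)
    table = [bytearray(256) for _ in range(n + 1)]
    for state in range(n):
        for byte in range(256):
            # Text matched so far, plus the new (case-folded) byte.
            seen = pat[:state] + bytes([byte]).lower()
            # Next state = length of the longest prefix of pat that is a suffix of seen.
            k = len(seen)
            while k > 0 and seen[-k:] != pat[:k]:
                k -= 1
            table[state][byte] = k
    # The accepting state absorbs all further input.
    for byte in range(256):
        table[n][byte] = n
    return table

def run(table, data: bytes) -> bool:
    """Single pass over the input, one table lookup per byte."""
    state = 0
    for byte in data:
        state = table[state][byte]
    return state == len(table) - 1

table = build_dfa(b"cmd.exe")
states = len(table)                          # 7 pattern characters + start state
size_bytes = sum(len(row) for row in table)  # states * 256 one-byte entries
```

With one row of 256 entries per state, memory grows linearly with the number of states, which is why the later slides track state counts so closely.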
22
Add a little bit of context
• *“GET”*[c|C][m|M][d|D].[e|E][x|X][e|E]
• 12 states
• size = 12*256 = 3072 bytes
23
And even more context
• (*\n\n)*“GET”[SP]+(PN)*[c|C][m|M][d|D].[e|E][x|X][e|E]
• 18 states
• size = 18*256 = 4608 bytes
SP denotes white space characters, and PN denotes characters that can appear in a URL according to the HTTP specification (e.g., ‘\n’ cannot appear in a URL).
24
Converting a traditional signature into a protomatching signature
1. Let S be a traditional signature
2. Expand S to conform to the protocol specification, obtaining S’
3. Expand S’ to account for all possible encodings, obtaining S’’
25
Representing encodings
The character c can be represented as: C, c, %43, %63, %U0043, %U0063, %u0043, %u0063
Replace every instance of the small machine with the large machine
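The substitution of the large machine for the small one can be illustrated with regular expressions. This is a sketch only: `encodings` and `expand` are hypothetical names, only letters are expanded, hex digits are generated in upper case only, and Python's `re` is a backtracking engine rather than the DFA used in the talk.

```python
import re

def encodings(ch: str) -> list:
    """All spellings of one letter listed on the slide: both cases,
    %XX hex escapes, and %UXXXX / %uXXXX escapes."""
    forms = []
    for variant in (ch.upper(), ch.lower()):
        code = ord(variant)
        forms.append(variant)
        forms.append("%{:02X}".format(code))
        forms.append("%U{:04X}".format(code))
        forms.append("%u{:04X}".format(code))
    return forms

def expand(signature: str) -> str:
    """Replace every letter (the 'small machine') by an alternation over
    all of its encodings (the 'large machine')."""
    parts = []
    for ch in signature:
        if ch.isalpha():
            alternatives = "|".join(re.escape(form) for form in encodings(ch))
            parts.append("(?:" + alternatives + ")")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)

pattern = re.compile(expand("cmd.exe"))
```

For example, `pattern.search("GET /a/%43MD.exe HTTP/1.1")` finds a match without any separate normalization step.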
26
And even more context
• (*\n\n)*“GET”[SP]+(PN)*[c|C][m|M][d|D].[e|E][x|X][e|E]
• 18 states
• size = 18*256 = 4608 bytes
27
• *\n\n“GET”[SP]+(PN)*[c-C][m-M][d-D].[e-E][x-X][e-E] and HEX encoding and U encoding
• 53 states
• size = 53*256 = 13,568 bytes
28
Building a protomatcher
1. Let S be a traditional signature
2. Expand S to conform to the protocol specification, obtaining S’
3. Expand S’ to account for all possible encodings, obtaining S’’
4. Perform 1-3 for every traditional signature in your database, obtaining S1’’, S2’’, …, Sn’’
5. Build the protomatcher: an FSM that identifies S1’’, S2’’, …, Sn’’
Problem: we increased each signature by a factor of 7 (at least). A full protomatcher does not fit into 2GB (or 4GB) of memory.
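The construction steps can be sketched end to end with regular expressions. Everything here is an assumption made for illustration: the function names, the second signature “root.exe”, and the simplification of step 3 to upper/lower case only. The sketch applies the encoding step to the literal before wrapping it in context, which yields the same expression for this simple case. Python's `re` is not a DFA, so this shows the construction, not the performance.

```python
import re

def add_encodings(literal: str) -> str:
    """Step 3, simplified: every letter may appear in upper or lower case."""
    out = []
    for ch in literal:
        if ch.isalpha():
            out.append("[" + ch.lower() + ch.upper() + "]")
        else:
            out.append(re.escape(ch))
    return "".join(out)

def add_context(pattern: str) -> str:
    """Step 2: the pattern must appear inside the URL of a GET request line."""
    return r"GET[ \t]+[^ \t\r\n]*" + pattern

def build_protomatcher(signatures):
    """Steps 4-5: one matcher that recognizes the union of all signatures."""
    expanded = [add_context(add_encodings(s)) for s in signatures]
    return re.compile("|".join("(?:" + e + ")" for e in expanded))

# Hypothetical two-entry signature database.
protomatcher = build_protomatcher(["cmd.exe", "root.exe"])
```

Because the context is part of the expression, a request such as `POST <URL>/cmd.exe` is not flagged, and neither is “cmd.exe” appearing outside the URL.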
29
Superset protomatching signature
• Assumption: the majority of the benign traffic is not only benign, but also not even similar to malicious traffic.
• For example, most benign traffic not only does not contain “cmd.exe”, but also does not contain “cmd.”
• Note that if a request does not contain “cmd.”, then it also does not contain “cmd.exe”
• “cmd.” is a superset signature because it matches the attack and more
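The superset property can be checked on a few sample requests. The requests and helper names below are made up for illustration, and encodings are ignored to keep the example short.

```python
def trim(signature: str, length: int) -> str:
    """Keep only the first `length` characters of a signature."""
    return signature[:length]

def matches(pattern: str, request: str) -> bool:
    """Case-insensitive substring test."""
    return pattern in request.lower()

full = "cmd.exe"
superset = trim(full, 4)  # the trimmed signature

# Hypothetical sample traffic.
requests = [
    "GET /scripts/cmd.exe HTTP/1.1",  # attack
    "GET /docs/cmd.txt HTTP/1.1",     # benign, but similar to the attack
    "GET /index.html HTTP/1.1",       # benign and dissimilar
]

full_hits = {r for r in requests if matches(full, r)}
superset_hits = {r for r in requests if matches(superset, r)}
```

Every request flagged by the full signature is also flagged by the trimmed one, while the trimmed one additionally flags the similar-looking benign request.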
30
Full protomatching signature for cmd.exe
• *\n\n“GET”[SP]+(PN)*[c-C][m-M][d-D].[e-E][x-X][e-E] and HEX encoding and U encoding
• 53 states
• size = 53*256 = 13,568 bytes
31
Superset protomatching signature for cmd.exe
• *\n\n“GET”[SP]+(PN)*[c-C][m-M][d-D]. and HEX encoding and U encoding
• 29 states
• size = 29*256 = 7,424 bytes
32
Building a superset protomatcher
1. Let S be a traditional signature
2. Trim S into a superset signature (e.g., “cmd.exe” into “cmd.”), obtaining S’
3. Expand S’ to conform to the protocol specification, obtaining S’’
4. Expand S’’ to account for all possible encodings, obtaining S’’’
5. Perform 1-4 for every traditional signature in your database, obtaining S1’’’, S2’’’, …, Sn’’’
6. Build the protomatcher: an FSM that identifies S1’’’, S2’’’, …, Sn’’’
33
Superset Protomatching
Input: GET <…>/%43MD.exe HTTP/1.1\n
Phase 1, superset protomatcher: match against Sig = superset protomatching signature
– No → Benign
– Yes → continue to phase 2
Phase 2, current approach (three phases):
1. Protocol analysis: Method = GET, URL = <…>/%43MD.exe, Version = HTTP/1.1
2. Normalization: URL = CMD.EXE
3. Pattern matching against Sig = CMD.EXE: Yes → Malicious, No → Benign
34
Implementation
• Implemented a compiler that converts a traditional signature into a protomatching signature
• The compiler also builds the protomatcher
• Incorporated the protomatcher into Snort
• Used traditional Snort as the second phase of a superset protomatcher
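The two-phase arrangement can be sketched as a cheap filter in front of an unmodified detector. This is not Snort's API: `full_nids` is a stand-in for the second phase, the regular expression is a hand-written superset signature for “cmd.” covering case and %XX escapes only, and the call counter exists only to show how much traffic reaches phase 2.

```python
import re
from urllib.parse import unquote

# Phase 1: hypothetical superset protomatching signature for "cmd.".
SUPERSET = re.compile(
    r"(?:c|%43|%63)(?:m|%4d|%6d)(?:d|%44|%64)(?:\.|%2e)", re.IGNORECASE
)

def full_nids(request: str) -> bool:
    """Phase 2: stand-in for the unmodified NIDS (analyze, normalize, match)."""
    method, url, _version = request.strip().split(" ", 2)
    return method == "GET" and "cmd.exe" in unquote(url).lower()

class TwoPhaseDetector:
    def __init__(self):
        self.second_phase_calls = 0

    def is_malicious(self, request: str) -> bool:
        # Traffic that misses the superset signature cannot match the full one.
        if not SUPERSET.search(request):
            return False
        self.second_phase_calls += 1
        return full_nids(request)
```

Only requests that resemble an attack pay for the full analysis; under the assumption that most benign traffic is dissimilar to attacks, that is a small fraction of the stream.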
35
Two ways to implement Protomatcher
• Using a deterministic FSM. This is the approach used in the examples.
• Using a hierarchical FSM. It has two parts: a matcher and a normalizer.
– The matcher is responsible for protocol analysis and pattern matching.
– The normalizer is responsible for processing multiple encodings.
– Unlike ANM, which first normalizes the whole HTTP request, it uses the normalizer only when necessary.
– This can help reduce the memory needed.
36
Performance improvement
ApPPT: Average per Packet Processing Time (cycles)
37
Comparison of protomatcher memory sizes
38
Sensitivity to Cache Poisoning Attack
• We assumed that the attack would have a larger effect on a protomatcher-based Snort than on vanilla Snort.
• But the result contradicts the assumption. There might be two reasons for this result:
– First, the attack was ineffective in increasing the number of cache misses. This means that a more sophisticated cache poisoning attack is needed.
– Second, the attack was effective, but cache performance is only a minor component of the ApPPT.
39
Conclusion
• Optimizing for the common case is a known method
• In this talk we presented a technique that uses this method to improve matching efficiency
• Our technique is based on formal methods
• These methods enable automation, and therefore efficiency, and facilitate accuracy
40
Discussion on shortcomings
• Failure due to Cache-poisoning attacks
• Converting a protomatching signature to a superset signature must be done manually. Better methods?