Reverse Engineering of Biomedical Elaborated Signal-Introduction
Fast Multi-string Pa ern Matching using PISA · Details to achieve the compact DFA tables and the...
Transcript of Fast Multi-string Pa ern Matching using PISA · Details to achieve the compact DFA tables and the...
Fast Multi-string Pa�ern Matching using PISAShicheng Wang, Chang Liu, Ying Liu, Guanyu Li, Menghao Zhang, Yangyang Wang, Mingwei Xu
Tsinghua University
CCS CONCEPTS• Security and privacy→ Network security; • Networks→ Net-work algorithms;
KEYWORDSString Matching; Programmable Switch; PISA
ACM Reference Format:ShichengWang, Chang Liu, Ying Liu, Guanyu Li,Menghao Zhang, YangyangWang, Mingwei Xu. 2019. Fast Multi-string Pattern Matching using PISA. InThe 15th International Conference on emerging Networking EXperiments andTechnologies (CoNEXT ’19 Companion), December 9–12, 2019, Orlando, FL,USA. ACM, New York, NY, USA, 2 pages. https://doi.org/10.1145/3360468.3368184
1 INTRODUCTIONMulti-string pattern matching serves as a fundamental build blockfor many network security applications, especially network intru-sion/prevention systems (IDS/IPS) [2], web application �rewalls(WAF) [8], and application identi�cation systems [6]. It typicallyexpresses certain attack signatures (rules) with multiple strings,which are then used to examine whether the payload of a packetmatches any of the prede�ned rules. Unfortunately, this often be-comes a bottleneck of these systems because every byte of thepacket has to be scanned by a large ruleset of strings.
Existing works often alleviate this bottleneck with algorithm im-provement [1] or GPU/FPGA acceleration [3, 7]. However, neitherthese software-improved nor hardware-accelerated approaches pro-vide the desired performance that catches up with the dramaticincrease of the network bandwidth and network tra�c today (e.g.,from multi-10s of Gbps to multi-100s of Gbps). We argue this per-formance gap lies in the performance capability of the hardwarefor acceleration, and the powerful programmable switch providesan unprecedented opportunity to bridge this gap. After all, theprogrammable switch is able to easily reach multi-Tbps through-put within a single box, which further reduces the per unit capitalexpense signi�cantly.
Despite this vision being promising, implementing multi-stringpattern matching on the programmable switch (i.e., Protocol Inde-pendent Switch Architecture (PISA)) is non-trivial. A recent work,PPS [4], has already demonstrated the feasibility of achieving fast
∗Supported by National Key R&D Program of China (2018YFB1800405 and2017YFB0801701) and National Science Foundation of China (No.61872426 andNo.61772307).
Permission to make digital or hard copies of part or all of this work for personal orclassroom use is granted without fee provided that copies are not made or distributedfor pro�t or commercial advantage and that copies bear this notice and the full citationon the �rst page. Copyrights for third-party components of this work must be honored.For all other uses, contact the owner/author(s).CoNEXT ’19 Companion, December 9–12,2019, Orlando, FL, USA© 2019 Copyright held by the owner/author(s).ACM ISBN 978-1-4503-7006-6/19/12. . . $15.00https://doi.org/10.1145/3360468.3368184
p1 ∧ p2 ∧ ¬p3 p1 ∧ p4 ∧ ¬p5
110** ⟶ R1 matched1**10 ⟶ R2 matched
……0000001* 0100000* ⟶ p1 matched
……
encodingand
transitionsharing
…
…strings
relationaloperations
R1content: "POST"content: "<NAME>"content: ! "</NAME>"
R2content: "POST"content: "<URL>"content: ! "</URL>"
Snort rules
DFA
DNF
DFA Table
Policy Table
Controller PISA Switch
converting
①
③
②
④
……
p1: "POST"p2: "<NAME>"p3: "</NAME>"p4: "<URL>"p5: "</URL>"
Figure 1: B��� architecture and work�ow.
string searching on PISA at ⇠Tbps. However, we identify two es-sential questions which remain unanswered. First, current securityapplications usually require a large set of rules (e.g., the latest com-munity ruleset of Snort has ⇠4000 rules), which also indicates thatthe translated deterministic �nite automaton (DFA) would also belarge. Simply mapping the DFA into the match action tables asPPS [4] requires an extremely large number of match action entries,which are challenging to be stored in the limited memory of the pro-grammable switch. Second, many security applications also requirecorrelated strings to express attack signatures (rules). For example,Figure 1 shows two simpli�ed snort rules (sid: 25355⇠25336) whereonly "content" �elds are saved. Rule #1 requires "POST", "<NAME>"both exist while "</NAME>" does not appear in the packet, and rule#2 requires "POST", "<URL>" both exist while "</URL>" does notappear. This requires that the programmable switch should supportrelational operations for multiple strings, which is challenging tobe e�ciently achieved on the restricted computational model ofthe programmable switch.
To address these problems, we propose B���, a fast multi-stringpattern matcher with the programmable switch. We utilize the tran-sition sharing to compress pattern entries to �t them into the limitedmemory on the programmable switch (§2.1). And we design a policytable at the end of the switch pipeline to express relational opera-tions among multiple strings (§2.2). Finally, we give a preliminaryimplementation and evaluation to demonstrate the e�ectiveness ofB��� (§3).
2 BOLT DESIGNThe overall architecture and work�ow of B��� are shown in Fig-ure 1. B��� controller extracts the strings and the relational opera-tions among strings from the given pattern rules separately, andtranslates them into di�erent match action tables. And the dataplane of B��� conducts pattern matching with DFA match actiontables and policy match action table. Details to achieve the compactDFA tables and the e�ective policy table are elaborated in §2.1 and§2.2.
2.1 Minimizing the entry numberB��� employs Aho-Corasick algorithm to construct an equivalentDFA for given strings, and then transforms the DFA into matchaction tables, similar to PPS [4]. During the mapping from the DFA
CoNEXT ’19 Companion, December 9–12,2019, Orlando, FL, USA Shicheng Wang et al.
Src State Character Dst State10 (S0)
01110000 (p)
00 (S1)00 (S1) 01 (S2)01 (S2) 01 (S2)11 (S3) 00 (S1)10 (S0) 01110100 (t) 11 (S0)
** ******** 10 (S0)
Src State Character Dst State0*
01110000 (p)01
1* 0010 01110100 (t) 11** ******** 00
S0
S1
S2
p
p
tp
p
S3
Encode
S0: 10S1: 00S2: 01S3: 11
Compress"ppt"
DFA DFA Table Compressed DFA Table
Figure 2: B��� compression procedure. Figure 3: Compression e�ciency.
to the match action entries, we observe that many transition edgeshave the same input character and destination state, and they onlydi�er in their source states. This implies that we can share sometransitions to reduce the transition number and save the switchmemory.
However, directly merging multiple transitions is not alwaysfeasible. Take the DFA in Figure 2 as an example, although the stateS0 and S3 have the same transitions with "p" to S1, they can notbe merged if we choose "00" and "11" to encode them separately.But if we encode S0 with "10" instead of "00", then the state S0and S3 can be represented by state "1*", and the transitions ("10","p", S1) and ("11", "p", S1) can be merged into a single transition("1*", "p", S1). To achieve an e�cient state encoding, we de�ne twotransitions with the same input character and the destination stateas a redundancy pair. To re-encode each state, we �rst construct acomplete graph for all states where the weight of each edge is thenumber of redundancy pairs between the edge’s two vertices. Thenwe calculate the maximum spanning tree for this complete graph,so a node in this tree is likely to have many redundancy pairs withits father/children nodes. Finally, we use codes with the same pre�xto re-encode nodes with descendent/ancestor relations. After thestate encoding, the next step is to seek out an e�ective compressingsolution to produce theminimumnumber of transitions.We observethat sharing transitions always have the same input character, sowe group re-encoded DFA match action entries according to theinput character, and then employ the classic dynamic programmingmethod in routing table compression [5] to obtain an e�ectivecompression for each group. We do not claim our scheme willget the optimal compression rate, only that they are empiricallye�ective and can get a reasonable compression result. We will seekthe optimal algorithm in future.
2.2 Correlated multi-string matchingTo achieve multi-string matching with complex relational opera-tions, our high level idea is to transform those operations in onerule into an entry (entries) in the policy table, as shown in Figure 1.First, B��� represent each rule in Boolean algebra as a Booleanexpression, in which each Boolean variable indicates whether thecorresponding string should be matched for this rule. Second, eachobtained Boolean expression is normalized to a more compact Dis-junctive Normal Form (DNF), which is a disjunction of multipleconjunctions. To accommodate these DNFs, we put them in thepolicy table whose match �elds represents the Boolean variables of
the DNFs: each entry in the policy table represents one conjunc-tion in the DNF, and the relationship between di�erent entries alsocorresponds to the disjunctive relations of di�erent conjunctions.Therefore, the DNF is �nally transformed into one entry (entries),the number of which is the same as that of conjunctions in the DNF.Since any Boolean expression can be converted to a DNF, all patternrules containing AND, OR, and NOT relations can be eventuallyconverted into entries of the policy table on the programmableswitch.
At runtime, for each packet, B��� maintains a bit vector to storethe Boolean variables, which indicates whether the correspondingstrings are matched. Once a single string is found in the payload ofthe packet, B��� sets the corresponding bit in its bit vector. Whencompleting the detection for the whole payload of the packet, B���match the policy table with the full bit vector, to see whether thepacket contains some rules. On the programmable switch, the bitvector of the packet is stored in the metadata.
3 EVALUATION AND FUTUREWORKWe implement an open-source B��� prototype with 400 lines ofP4 code for the data plane and 500 lines of Python code for thecontroller1. We select "content" patterns from snort rules and jointhem with relational operations as the original rule. As depictedin Figure 3, comparing with PPS [4], B��� reduces the number ofentries by ⇠80%, which makes it possible to �t 2000 patterns within10K match-action entries in one hardware switch. Besides, B���can function normally under correlated multiple strings in onerule, which also demonstrates the e�ectiveness of our policy table.In future, we plan to further optimize the compression algorithm,implement B��� on the hardware To�no switch and conduct moreevaluations in real-world scenarios.
REFERENCES[1] Byungkwon Choi and et al. 2016. {DFC}: Accelerating String Pattern Matching
for Network Applications. In NSDI. 551–565.[2] Cisco. 2019. Snort. https://www.snort.org/.[3] Muhammad Asim Jamshed and et al. 2012. Kargus: a highly-scalable software-
based intrusion detection system. In CCS. ACM, 317–328.[4] Theo Jepsen and et al. 2019. Fast String Searching on PISA. In SOSR.[5] Alex X Liu and et al. 2010. TCAM Razor: A systematic approach towards minimiz-
ing packet classi�ers in TCAMs. TON 18 (2010).[6] ntop. 2019. nDPI. https://www.ntop.org.[7] Reetinder Sidhu and Viktor K Prasanna. 2001. Fast regular expression matching
using FPGAs. In FCCM. IEEE, 227–238.[8] Trustwave. 2019. ModSecurity. https://modsecurity.org/.1https://github.com/LamperougeWang/P4PatternMatch