www.exegy.com
Exploiting Reconfigurability for Text Search
Roger D. Chamberlain, Mark A. Franklin, and Ron S. Indeck
Exegy Inc.
Exegy TextMiner
Highly Optimized Data Pipeline from Input thru Output
Specialized Processing in Close Proximity to Data
1-7 TB fast RAID; RAM / FPGA contiguous
Specialized Processing on Custom Board
FPGA accelerated custom board
• Permits massively parallel operations
• Offloads work from CPU
• Integrates with other system components enabling high-speed data ingress and egress
• Designed with common APIs to give user control of functionality
• Draws from a library of pre-defined modules used to perform certain operations
• New functional modules readily incorporated
Analogous to graphic accelerator cards
Exegy A2000 Appliance
[Figure: A2000 block diagram. Labels: processor, disk controller, disk, data to processor, configuration subsystem, reconfigurable logic, network.]
TextMiner Application
• Searching through an unindexed text corpus for items of interest
• Example query: (Cardinals NEAR[200] Baseball) AND (Manchester NEAR[200] Soccer)
  “Cardinals” within 200 characters of “Baseball” and “Manchester” within 200 characters of “Soccer”
• Supported combining operators include
  Boolean: AND, OR, NOT
  Proximity: NEAR, ANDTHEN
Benefits of reconfiguration
[Figure, built up over four slides. Labels: formulate initial query, analyze query results, formulate revised query.]
Query Options
Exact Search
• Literal keywords must match exactly
• Tens of thousands of keywords searched in one pass across the data
Approximate Search
• Wildcard characters
• Case insensitivity
• Character substitution up to specified bound
Regular Expression Search
• Full expressive power of finite-state machine recognizer
Exact Match Engine
Startup
• Hash keywords to a bit vector in FPGA
• Rabin-Karp hash functions
Run
• Stream text corpus from disk or network to FPGA
• Hash text to bit vector position
• Check position for keyword hit
Check
• False positives from hash collisions checked in software
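The Startup, Run, and Check steps can be sketched in software as follows. This is a minimal illustration, not the FPGA design: the hash radix, modulus, bit vector size, function names, and the restriction to equal-length keywords are all assumptions made for the example.

```python
BASE = 257                 # rolling-hash radix (assumed)
MOD = (1 << 61) - 1        # hash modulus, a Mersenne prime (assumed)
VECTOR_BITS = 1 << 16      # bit vector length (assumed)


def rk_hash(window: bytes) -> int:
    """Polynomial (Rabin-Karp style) hash of a byte string."""
    h = 0
    for byte in window:
        h = (h * BASE + byte) % MOD
    return h


def build_bit_vector(keywords: list[bytes]) -> bytearray:
    """Startup: set one bit per keyword hash."""
    bits = bytearray(VECTOR_BITS // 8)
    for kw in keywords:
        pos = rk_hash(kw) % VECTOR_BITS
        bits[pos >> 3] |= 1 << (pos & 7)
    return bits


def exact_search(text: bytes, keywords: list[bytes]) -> list[tuple[int, bytes]]:
    """Run and Check: return (offset, keyword) for every verified hit."""
    m = len(keywords[0])
    if any(len(kw) != m for kw in keywords):
        raise ValueError("this sketch assumes equal-length keywords")
    bits = build_bit_vector(keywords)
    verified = set(keywords)
    hits: list[tuple[int, bytes]] = []
    if len(text) < m:
        return hits
    high = pow(BASE, m - 1, MOD)          # weight of the byte leaving the window
    h = rk_hash(text[:m])
    for i in range(len(text) - m + 1):
        pos = h % VECTOR_BITS
        if bits[pos >> 3] & (1 << (pos & 7)):    # candidate hit in the bit vector
            window = text[i:i + m]
            if window in verified:                # software check drops false positives
                hits.append((i, window))
        if i + m < len(text):                     # roll the hash forward by one byte
            h = ((h - text[i] * high) * BASE + text[i + m]) % MOD
    return hits
```

For example, `exact_search(b"the horse and the house", [b"horse", b"house"])` returns `[(4, b"horse"), (18, b"house")]`. The bit vector test is cheap and may report collisions; only the final set lookup decides whether a hit is real.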
Approximate Match Engine
• Data shift register receives inbound text
• Compared with keyword at character level
• Count of matching characters is checked with threshold
• If character matches exceed threshold, keyword is a match
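[Figure: approximate match datapath. Labels: input data, data shift register, compare register, fine-grain comparison, count (4), word-level comparison (> threshold?), match signal. Example: "horse" compared with "house", four characters equal (=) and one unequal (≠).]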
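The character-level comparison and threshold test can be sketched as a sliding window in software. This is a minimal illustration under assumptions: the function name and signature are hypothetical, and the window simply slides one character at a time in place of the hardware shift register.

```python
def approx_matches(text: str, keyword: str, threshold: int) -> list[int]:
    """Return offsets where more than `threshold` characters match the keyword."""
    m = len(keyword)
    offsets = []
    for i in range(len(text) - m + 1):
        window = text[i:i + m]                        # current shift-register contents
        count = sum(a == b for a, b in zip(window, keyword))   # fine-grain comparison
        if count > threshold:                         # word-level comparison
            offsets.append(i)
    return offsets
```

With the slide's example, `approx_matches("a horse", "house", 3)` returns `[2]`, because "horse" and "house" agree in four of five positions. Raising the threshold to 4 yields no match.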
Regular Expression Engine
[Figure: regular expression engine pipeline. Labels: regular expression compiler, input data, symbol encoding, indirection table, addr. logic, transition table, state selection logic, current state, match signal.]
• Multi-character strings are combined into single symbol for finite state machine recognizer
• State dependent transitions are deferred to end of pipeline
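A table-driven recognizer of the general kind named in the figure labels can be sketched as below. This is an illustration only: the pattern `ab+c`, the table contents, and the exact role given to each table are assumptions, the tables are written by hand rather than produced by a compiler, and the sketch consumes one byte per step instead of combining multi-character strings into one symbol.

```python
# Symbol classes produced by the encoding step (assumed for the pattern ab+c).
SYM_A, SYM_B, SYM_C, SYM_OTHER = 0, 1, 2, 3

# Indirection table: maps each input byte to a small symbol class.
INDIRECTION = [SYM_OTHER] * 256
INDIRECTION[ord("a")] = SYM_A
INDIRECTION[ord("b")] = SYM_B
INDIRECTION[ord("c")] = SYM_C

ACCEPT = 3

# Transition table indexed by [current state][symbol class].
# State 0: nothing matched, 1: seen "a", 2: seen "ab+", 3: seen "ab+c".
TRANSITION = [
    [1, 0, 0, 0],
    [1, 2, 0, 0],
    [1, 2, 3, 0],
    [1, 0, 0, 0],   # after a match, restart like state 0
]


def find_match_ends(data: bytes) -> list[int]:
    """Return the offset of the last byte of every match of ab+c in data."""
    state = 0
    ends = []
    for i, byte in enumerate(data):
        symbol = INDIRECTION[byte]            # symbol encoding
        state = TRANSITION[state][symbol]     # state selection
        if state == ACCEPT:                   # match signal
            ends.append(i)
    return ends
```

For example, `find_match_ends(b"xxabbc abc ac")` returns `[5, 9]`. Mapping bytes to a few symbol classes keeps the transition table small, which is one common motivation for an encoding stage ahead of the state machine.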
Combining Operations
• Combining operations implemented in software
• Based on keyword hits from FPGA
[Figure: expression tree for (Cardinals NEAR Baseball) AND (Manchester NEAR Soccer). An AND node sits above two NEAR nodes, whose leaves are Cardinals, Baseball and Manchester, Soccer.]
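The software side of the combining step can be sketched as follows. This is a minimal illustration under assumptions: hits are taken to be character offsets of keyword starts within one document, NEAR is true when any pair of offsets lies within the given distance, and the function names are hypothetical.

```python
def near(hits_a: list[int], hits_b: list[int], distance: int) -> bool:
    """True if some hit of A lies within `distance` characters of some hit of B."""
    return any(abs(a - b) <= distance for a in hits_a for b in hits_b)


def evaluate_example(hits: dict[str, list[int]], distance: int = 200) -> bool:
    """Evaluate (Cardinals NEAR Baseball) AND (Manchester NEAR Soccer)."""
    left = near(hits.get("Cardinals", []), hits.get("Baseball", []), distance)
    right = near(hits.get("Manchester", []), hits.get("Soccer", []), distance)
    return left and right
```

Given hits `{"Cardinals": [10], "Baseball": [150], "Manchester": [1000], "Soccer": [1100]}`, `evaluate_example` returns `True`. Moving the only "Soccer" hit to offset 1300 makes the right-hand NEAR fail, so the result is `False`.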
Summary of 3 Hardware Search Engines
• Searching for individual terms, combining operations performed in software
• Three distinct engines supported:
  Exact match: thousands of terms, 800 MB/s search rate
  Approximate match: can trade off # of terms vs. characters per term, 800 MB/s search rate
  Regular expression search: capable of ~50 expressions, 400 MB/s search rate
• Data source(s) can be local or remote
Managing FPGA Configuration
[Figure: configuration management block diagram. Labels: Configuration Files, User Data, Supervisor CPLD, Configuration Store, Directory, Application, FPGA(s), Results.]
Supervisor Functions
Manage Configuration Store
• On-board non-volatile storage for configurations
• Supports multiple configuration files
Manage Directory
• Meta-data for configurations in configuration store
Load FPGA as instructed
• Reconfigure FPGA from specified config file currently in configuration store
Protection
• Block data path during reconfiguration
• Check configuration is appropriate for that FPGA
Software Options: Insert
Place a configuration in the on-board store
Read Directory
Query current directory contents
Read Configuration
Primarily for verification and debugging purposes
Load
Reconfigure FPGA
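The four software options (Insert, Read Directory, Read Configuration, Load) and the supervisor's protection checks can be modeled in a few lines. This is a hypothetical software model, not the Exegy API: the class, method names, and directory fields are assumptions chosen to mirror the slide titles.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Supervisor:
    fpga_type: str                                   # which FPGA this supervisor guards
    store: dict = field(default_factory=dict)        # configuration store: name -> image bytes
    directory: dict = field(default_factory=dict)    # meta-data: name -> info about the image
    loaded: Optional[str] = None                     # name of the configuration now in the FPGA
    data_path_blocked: bool = False

    def insert(self, name: str, image: bytes, target_fpga: str) -> None:
        """Insert: place a configuration in the on-board store."""
        self.store[name] = image
        self.directory[name] = {"target": target_fpga, "size": len(image)}

    def read_directory(self) -> dict:
        """Read Directory: query current directory contents."""
        return dict(self.directory)

    def read_configuration(self, name: str) -> bytes:
        """Read Configuration: mainly for verification and debugging."""
        return self.store[name]

    def load(self, name: str) -> None:
        """Load: reconfigure the FPGA from a configuration already in the store."""
        if name not in self.store:
            raise KeyError(f"{name} is not in the configuration store")
        if self.directory[name]["target"] != self.fpga_type:
            raise ValueError(f"{name} is not appropriate for {self.fpga_type}")
        self.data_path_blocked = True        # protection: block data path while loading
        try:
            self.loaded = name               # stand-in for programming the device
        finally:
            self.data_path_blocked = False
```

A typical sequence would be `insert`, then `read_directory` to confirm the entry, then `load`. Keeping the compatibility check inside `load` means a mismatched configuration is rejected before the data path is touched.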
Back to Text Search Application
Sequence of events:
1. User provides query and initiates search
2. Examining search terms, software selects appropriate engine
3. Load configuration in FPGA, concurrently queue up data from source
4. Load search terms into engine
5. Stream data through engine
6. Process hits that return, performing combining operations
7. Return results to user
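Steps 2 through 5 of this sequence can be sketched with two concurrent tasks standing in for FPGA configuration and data queuing. Everything here is an assumption for illustration: terms are presumed to arrive already tagged with a kind by a query parser, the function names are hypothetical, and a plain substring search stands in for the hardware engine.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def select_engine(terms: list[tuple[str, str]]) -> str:
    """Step 2: pick one engine; terms are (kind, text) pairs."""
    kinds = {kind for kind, _ in terms}
    if len(kinds) != 1:
        raise ValueError("mixed term kinds need multiple engines or data passes")
    return kinds.pop()


def load_configuration(engine: str) -> str:
    """Stand-in for reconfiguring the FPGA (the delay is illustrative)."""
    time.sleep(0.02)
    return engine


def queue_data(source: str) -> str:
    """Stand-in for buffering data from disk or network."""
    return source


def run_search(terms: list[tuple[str, str]], source: str):
    engine = select_engine(terms)
    with ThreadPoolExecutor(max_workers=2) as pool:      # step 3: overlap both tasks
        config = pool.submit(load_configuration, engine)
        data = pool.submit(queue_data, source)
        config.result()
        buffered = data.result()
    keywords = [text for _, text in terms]               # step 4: load search terms
    hits = {kw: [] for kw in keywords}
    for kw in keywords:                                  # step 5: stream data through engine
        start = buffered.find(kw)
        while start != -1:
            hits[kw].append(start)
            start = buffered.find(kw, start + 1)
    return engine, hits                                  # steps 6 and 7 are left to the caller
```

Calling `run_search([("exact", "Soccer"), ("exact", "Manchester")], "Manchester plays Soccer")` returns `("exact", {"Soccer": [17], "Manchester": [0]})`. Submitting both tasks before waiting on either is what lets the configuration delay overlap with the initial data reads.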
Comments
Benefits
• Application software chooses appropriate FPGA engine
• Engine is tailored to problem at hand
Concerns
• Heterogeneous query
  Requires multiple engines or multiple data passes
• Configuration overhead
  20 ms is longer than we would like
  However, it’s not out of line with startup times required for disk access
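To put the 20 ms figure in perspective, it can be converted into the amount of data an engine would have streamed in the same time, using the search rates quoted in the engine summary. The helper name is hypothetical; the numbers are the ones stated on the slides.

```python
CONFIG_MS = 20                                        # configuration time from the slide
RATES_MB_S = {"exact": 800, "approximate": 800, "regex": 400}   # search rates from the summary


def data_equivalent_mb(rate_mb_s: int, config_ms: int = CONFIG_MS) -> float:
    """Megabytes an engine would process during one reconfiguration."""
    return rate_mb_s * config_ms / 1000


for engine, rate in RATES_MB_S.items():
    print(f"{engine}: {data_equivalent_mb(rate)} MB")
```

At 800 MB/s the overhead equals 16 MB of streamed data, and at 400 MB/s it equals 8 MB, so the cost is small relative to a multi-terabyte corpus but noticeable for short searches.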
Summary
• Exegy A2000 appliance supports dynamic reconfiguration of application FPGAs
• Exegy TextMiner application exploits dynamic reconfiguration for text search
• 3 distinct search engines: exact, approximate, and regular expression
• FPGA configuration is concurrent with initial data reads to mask latency
• Result is a true exploitation of the physical ability to reconfigure FPGAs on the fly