Implementing a custom search syntax using solr, lucene & parboiled

Post on 04-Jul-2015

4.607 views 5 download

description

Presented by John Berryman, Search Architect, Opensource Connections In a recent project with the US Patent and Trademark Office, Opensource Connections was asked to prototype the next generation of patent search - using Solr and Lucene. An important aspect of this project was the implementation of BRS, a specialize search syntax used by patent examiners during the examination process. In this fast paced session we will relate our experiences and describe how we used a combination of Parboiled (a Parser Expression Grammar [PEG] parser), Lucene Queries and SpanQueries, and an extension of Solr's QParserPlugin to build BRS search functionality in Solr. First we will characterize the patent search problem and then define the BRS syntax itself. We will then introduce the Parboiled parser and discuss various considerations that one must make when designing a syntax parser. Following this we will describe the methodology used to implement the search functionality in Lucene/Solr. Finally, we will include an overview our syntactic and semantic testing strategies. The audience will leave this session with an understanding of how Solr, Lucene, and Parboiled may be used to implement their own custom search parser.

Transcript of Implementing a custom search syntax using solr, lucene & parboiled