Posted on 31-Jul-2015
© 2002 - 2015 Jahia Solutions Group SA
Search at scale
Free Digital Factory from running searches!
Christophe Laprun / Kevan Jahanshahi
Jahia Solutions Group SA
Current architecture
- Searching most easily accomplished using JSP taglib
- Uses SearchService
- Delegates to SearchProvider
- JahiaJCRSearchProvider relies on Lucene to perform search at the JCR level
Limitations
- Search contends for JCR access with "regular" operations
- Difficult to scale
- Difficult to support some use cases using "pure" Lucene
Idea: externalize search
- Free Digital Factory from running searches so that it can focus on its core responsibilities
Requirements
- Shouldn't impact Digital Factory core
- As transparent as possible to end users
- Functionally equivalent to internal search
- Support horizontal scaling
- Should support extended search use cases for future improvements
Digital Factory infrastructure
- Added support for multiple SearchProvider implementations
- You can add your own: one interface, four methods to implement (two, really)
- Automatically found and registered by Digital Factory if available in the Spring context
- Server settings provided to select implementation
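The registration and selection mechanism can be sketched in plain Java. This is not Digital Factory's actual API: the `Provider` interface, its two methods, and the `ProviderRegistry` class are hypothetical names chosen for illustration, and the `register` and `select` calls merely stand in for the Spring context lookup and the server setting.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a search provider contract (not the real Jahia interface).
interface Provider {
    String getName();
    List<String> search(String query);
}

// Hypothetical registry: collects providers and exposes the one chosen by a setting.
class ProviderRegistry {
    private final Map<String, Provider> providers = new LinkedHashMap<>();
    private String selected;

    // Mimics automatic registration of every provider found in the context.
    void register(Provider provider) {
        providers.put(provider.getName(), provider);
        if (selected == null) {
            selected = provider.getName(); // first registered provider is the default
        }
    }

    // Mimics the server setting that selects the active implementation.
    void select(String name) {
        if (!providers.containsKey(name)) {
            throw new IllegalArgumentException("Unknown provider: " + name);
        }
        selected = name;
    }

    // Names of all registered providers, in registration order.
    List<String> names() {
        return new ArrayList<>(providers.keySet());
    }

    // Delegates the query to the single active provider.
    List<String> search(String query) {
        if (selected == null) {
            throw new IllegalStateException("No provider registered");
        }
        return providers.get(selected).search(query);
    }
}
```

Callers only ever talk to the registry, so switching from one provider to another changes nothing for them.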
Which search engine?
Elasticsearch, for several reasons:
- Open source
- Built-in distribution / clustering
- Better support for our use cases
- Easier to get started with
- Nicer documentation
- Easier to work with (RESTful API / JSON)
Architecture
- Digital Factory module
- SearchProvider implementation
- Embeds Elasticsearch but can connect to a running server
- Deploys JCR event listeners to automatically index data changes as they happen
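The listener-driven indexing can be illustrated with a small sketch. Everything here is an assumption made for illustration: `ChangeEvent` and `IndexingListener` are hypothetical types (a real JCR listener receives `javax.jcr.observation` events instead), and an in-memory map stands in for the Elasticsearch index.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical change event; it only carries what the sketch needs.
class ChangeEvent {
    enum Type { ADDED, CHANGED, REMOVED }

    final Type type;
    final String path;
    final String content; // may be null for REMOVED events

    ChangeEvent(Type type, String path, String content) {
        this.type = type;
        this.path = path;
        this.content = content;
    }
}

// Hypothetical listener that mirrors repository changes into an index.
class IndexingListener {
    // In-memory stand-in for the external index, keyed by node path.
    private final Map<String, String> index = new HashMap<>();

    // Applies one change to the index as soon as it is reported.
    void onEvent(ChangeEvent event) {
        switch (event.type) {
            case ADDED:
            case CHANGED:
                index.put(event.path, event.content); // insert or overwrite the document
                break;
            case REMOVED:
                index.remove(event.path); // drop the document for a deleted node
                break;
        }
    }

    String indexed(String path) {
        return index.get(path);
    }

    int size() {
        return index.size();
    }
}
```

Because each change is applied when it is reported, the index stays current without a periodic full reindex.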
Indexing
- Only defined node types are indexed
- Only string, date and j:extractedText properties are indexed
- Possible to exclude properties
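These three rules amount to a simple filter. The sketch below is one possible reading of them, not the module's code: the `IndexFilter` class and its method are hypothetical, property types are passed as plain strings, and the node type and property names used with it are illustrative values only.

```java
import java.util.Set;

// Hypothetical filter deciding whether a property should be sent to the index.
class IndexFilter {
    private final Set<String> indexedNodeTypes;   // node types configured for indexing
    private final Set<String> excludedProperties; // property names explicitly excluded

    IndexFilter(Set<String> indexedNodeTypes, Set<String> excludedProperties) {
        this.indexedNodeTypes = indexedNodeTypes;
        this.excludedProperties = excludedProperties;
    }

    boolean isIndexable(String nodeType, String propertyName, String propertyType) {
        if (!indexedNodeTypes.contains(nodeType)) {
            return false; // rule 1: only defined node types
        }
        if (excludedProperties.contains(propertyName)) {
            return false; // rule 3: explicit exclusions win
        }
        // rule 2: string and date properties, plus the extracted text of documents
        return "j:extractedText".equals(propertyName)
                || "String".equals(propertyType)
                || "Date".equals(propertyType);
    }
}
```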
More indexing
- Automatically leverages node type definitions to create Elasticsearch mappings
- I18n properties => language-specific analyzer
- Non-text searchable => not analyzed
- IndexType.NO => not indexed
- Boost properly supported
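The translation rules above can be sketched as a function from a property definition to a field mapping. The source does not show how the module builds its mappings, so `PropertyDef` and `MappingSketch` are hypothetical. The output uses Elasticsearch 1.x-style string field options (`index: no`, `index: not_analyzed`, `analyzer`, `boost`), which fit the period of this talk; later Elasticsearch versions changed them. Only string properties are covered.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified view of a property definition taken from a node type.
class PropertyDef {
    final boolean i18n;               // value differs per language
    final boolean fullTextSearchable; // should be analyzed for full-text search
    final boolean indexed;            // false models IndexType.NO
    final double boost;               // 1.0 means no boost

    PropertyDef(boolean i18n, boolean fullTextSearchable, boolean indexed, double boost) {
        this.i18n = i18n;
        this.fullTextSearchable = fullTextSearchable;
        this.indexed = indexed;
        this.boost = boost;
    }
}

class MappingSketch {
    // Builds the mapping for one string field, following the rules listed on the slide.
    static Map<String, Object> fieldMapping(PropertyDef def, String languageAnalyzer) {
        Map<String, Object> mapping = new LinkedHashMap<>();
        mapping.put("type", "string");
        if (!def.indexed) {
            mapping.put("index", "no"); // IndexType.NO => not indexed
            return mapping;
        }
        if (!def.fullTextSearchable) {
            mapping.put("index", "not_analyzed"); // exact-value field, no analysis
            return mapping;
        }
        if (def.i18n) {
            mapping.put("analyzer", languageAnalyzer); // e.g. "french" or "english"
        }
        if (def.boost != 1.0) {
            mapping.put("boost", def.boost); // carry the boost over to the field
        }
        return mapping;
    }
}
```

For example, an internationalized, boosted title with the `french` analyzer yields a map holding `type`, `analyzer` and `boost`, while a property that is not indexed yields only `type` and `index`.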
Installation / configuration
- Deploy module
- SearchProvider selection
- Elasticsearch configuration:
  - Cluster name
  - Indexed node types
  - Excluded properties
A quick look at the data
- Using elasticsearch-head (http://mobz.github.io/elasticsearch-head)
- Requires CORS support (automatically activated in Dev mode by implementation)
Joining an existing cluster
- Start a separate Elasticsearch instance
- Change cluster name in settings to join separate instance cluster
- Check indices
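The last step, checking indices, can be done against Elasticsearch's `_cat/indices` REST endpoint. The sketch below assumes the separate instance listens on its default HTTP port 9200 on a host you supply. The `IndexCheck` class is a hypothetical helper, not part of the module.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for listing the indices of a running Elasticsearch node.
class IndexCheck {
    // Builds the URL of the cat-indices endpoint; "?v" asks for column headers.
    static String catIndicesUrl(String host, int port) {
        return "http://" + host + ":" + port + "/_cat/indices?v";
    }

    // Performs a plain GET and returns the response body as text.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(2000);
        conn.setReadTimeout(2000);
        StringBuilder body = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
        } finally {
            conn.disconnect();
        }
        return body.toString();
    }
}
```

Calling `IndexCheck.fetch(IndexCheck.catIndicesUrl("localhost", 9200))` against a running node returns one line per index. If the index created by the module appears in that list, the node has joined the intended cluster.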
Benefits
- Transparent to users as it uses the same search infrastructure / UI
- Decouples searches from "normal" Digital Factory operations
- Transparent scalability using Elasticsearch clustering
Current state
- Initial implementation to validate concept
  - Indexer
  - Listeners
- Available as an Enterprise Edition add-on: talk to your account manager!
Current limitations
- Only one SearchProvider per server
- Still requires converting results to JCR nodes, in particular to ensure proper permissions
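The permission concern can be pictured as a post-filter on raw hits. This is a sketch under assumptions, not the module's implementation: `HitFilter` is a hypothetical class, and the `canRead` predicate stands in for resolving each hit to a JCR node in the current user's session.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical post-filter applied to the paths returned by the search engine.
class HitFilter {
    // Keeps only the hits the current user may read, preserving the result order.
    static List<String> visibleHits(List<String> hitPaths, Predicate<String> canRead) {
        List<String> visible = new ArrayList<>();
        for (String path : hitPaths) {
            if (canRead.test(path)) { // stand-in for a per-node permission check
                visible.add(path);
            }
        }
        return visible;
    }
}
```

Each hit costs one check against the repository, which is the remaining coupling that this limitation describes.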
Future
- Better indexing to improve search performance
- Avoid going back to JCR nodes to create a search hit to further decouple from Digital Factory core
- Better access to Elasticsearch
- More powerful search mechanisms
Questions?