JahiaOne 2015 - Search at scale: free Digital Factory from running searches!

Post on 31-Jul-2015

© 2002 - 2015 Jahia Solutions Group SA

Search at scale
Free Digital Factory from running searches!

Christophe Laprun / Kevan Jahanshahi
Jahia Solutions Group SA

Current architecture

Searching most easily accomplished using JSP taglib
Uses SearchService
Delegates to SearchProvider
JahiaJCRSearchProvider relies on Lucene to perform search at the JCR level

Limitations

Search contends for JCR access with "regular" operations
Difficult to scale
Difficult to support some use cases using "pure" Lucene

Idea: externalize search

Free Digital Factory from running searches so that it can focus on its core responsibilities

Requirements

Shouldn't impact Digital Factory core
As transparent as possible to end users
Functionally equivalent to internal search
Support horizontal scaling
Should support extended search use cases for future improvements

Digital Factory infrastructure

Added support for multiple SearchProvider implementations

You can add your own: one interface, four methods to implement (two, really)

Automatically found and registered by Digital Factory if available in Spring context

Server settings provided to select implementation
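
The pluggable-provider idea can be sketched roughly as follows. This is only an illustration under stated assumptions: `SimpleSearchProvider`, `ProviderRegistry` and `InMemoryProvider` are hypothetical names invented for this sketch, the interface is a simplified stand-in rather than Digital Factory's actual SearchProvider contract, and explicit `register` calls replace the Spring-based discovery.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical, simplified provider contract (NOT the real Digital Factory interface).
interface SimpleSearchProvider {
    String getName();
    boolean isEnabled();
    List<String> search(String query);
}

// Keeps every registered provider and exposes the one currently selected,
// loosely mirroring "register all, pick one in the server settings".
class ProviderRegistry {
    private final Map<String, SimpleSearchProvider> providers = new LinkedHashMap<>();
    private String selected;

    void register(SimpleSearchProvider provider) {
        providers.put(provider.getName(), provider);
        if (selected == null) {
            selected = provider.getName(); // first registration becomes the default
        }
    }

    void select(String name) {
        SimpleSearchProvider provider = providers.get(name);
        if (provider == null || !provider.isEnabled()) {
            throw new IllegalArgumentException("Unknown or disabled provider: " + name);
        }
        selected = name;
    }

    SimpleSearchProvider current() {
        if (selected == null) {
            throw new IllegalStateException("No search provider registered");
        }
        return providers.get(selected);
    }
}

// Toy provider doing a case-insensitive substring match over in-memory strings.
class InMemoryProvider implements SimpleSearchProvider {
    private final String name;
    private final List<String> documents;

    InMemoryProvider(String name, List<String> documents) {
        this.name = name;
        this.documents = new ArrayList<>(documents);
    }

    @Override
    public String getName() { return name; }

    @Override
    public boolean isEnabled() { return true; }

    @Override
    public List<String> search(String query) {
        String needle = query.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String doc : documents) {
            if (doc.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(doc);
            }
        }
        return hits;
    }
}
```

Calling code only ever talks to `registry.current()`, so switching from one provider to another is a single `select(...)` call and stays invisible to whatever issues the searches.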

Which search engine?

Elasticsearch for several reasons:
    Open source
    Built-in distribution / clustering
    Better support for our use cases
    Easier to get started with
    Nicer documentation
    Easier to work with (RESTful API / JSON)

Architecture

Digital Factory module
SearchProvider implementation
Embeds Elasticsearch but can connect to a running server
Deploys JCR event listeners to automatically index data changes as they happen
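
How listener events might translate into index operations can be sketched as below. Everything here is an assumption made for illustration: `ChangeType`, `Change`, `IndexAction` and `IndexingPlanner` are hypothetical names, the event kinds only loosely follow JCR observation events, and the module's actual listener logic is not shown in the slides.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified event kinds, loosely modelled on JCR observation events.
enum ChangeType { NODE_ADDED, NODE_REMOVED, PROPERTY_ADDED, PROPERTY_CHANGED, PROPERTY_REMOVED }

// What the indexer should do with a given node path.
enum IndexAction { INDEX, DELETE }

// One repository change: for property events, the path points at the property.
class Change {
    final ChangeType type;
    final String path;

    Change(ChangeType type, String path) {
        this.type = type;
        this.path = path;
    }
}

class IndexingPlanner {
    // Collapses a batch of changes into at most one action per node path.
    static Map<String, IndexAction> plan(List<Change> changes) {
        Map<String, IndexAction> actions = new LinkedHashMap<>();
        for (Change change : changes) {
            switch (change.type) {
                case NODE_ADDED:
                    actions.put(change.path, IndexAction.INDEX);
                    break;
                case NODE_REMOVED:
                    actions.put(change.path, IndexAction.DELETE);
                    break;
                default: {
                    // Property events trigger a re-index of the owning node,
                    // unless that node is already scheduled for deletion.
                    String nodePath = parentPath(change.path);
                    if (actions.get(nodePath) != IndexAction.DELETE) {
                        actions.put(nodePath, IndexAction.INDEX);
                    }
                    break;
                }
            }
        }
        return actions;
    }

    // "/a/b/jcr:title" -> "/a/b"; a path directly under the root maps to "/".
    static String parentPath(String path) {
        int slash = path.lastIndexOf('/');
        return slash <= 0 ? "/" : path.substring(0, slash);
    }
}
```

Collapsing a batch to one action per node keeps the number of index requests down when a single save touches several properties of the same node.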

Indexing

Only defined node types are indexed
Only string, date and j:extractedText properties are indexed
Possible to exclude properties
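
The three indexing rules above can be expressed as a small filter. This is a minimal sketch, not the module's code: `IndexingFilter` and `PropKind` are hypothetical names, and treating j:extractedText as allowed by name regardless of its declared type is an assumption made for the example.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Simplified property kinds; real JCR defines more property types.
enum PropKind { STRING, DATE, LONG, BOOLEAN, BINARY }

class IndexingFilter {
    private final Set<String> indexedNodeTypes;
    private final Set<String> excludedProperties;

    IndexingFilter(Collection<String> indexedNodeTypes, Collection<String> excludedProperties) {
        this.indexedNodeTypes = new HashSet<>(indexedNodeTypes);
        this.excludedProperties = new HashSet<>(excludedProperties);
    }

    // Rule 1: only node types listed in the configuration are indexed.
    boolean shouldIndexNode(String nodeType) {
        return indexedNodeTypes.contains(nodeType);
    }

    boolean shouldIndexProperty(String name, PropKind kind) {
        // Rule 3: explicit exclusions always win.
        if (excludedProperties.contains(name)) {
            return false;
        }
        // Rule 2: j:extractedText is allowed by name (assumption, see lead-in),
        // otherwise only string and date properties qualify.
        if ("j:extractedText".equals(name)) {
            return true;
        }
        return kind == PropKind.STRING || kind == PropKind.DATE;
    }
}
```

For instance, a filter built with the node type `jnt:page` and the excluded property `jcr:createdBy` would index a page's `jcr:title` but skip its `jcr:createdBy`, even though both are strings.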

More indexing

Automatically leverages node type definitions to create Elasticsearch mappings
    I18n properties => language-specific analyzer
    Non-text searchable => not analyzed
    IndexType.NO => not indexed
    Boost properly supported
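
A rough sketch of how such rules could produce a field mapping follows. The assumptions are significant: `PropertyDef` and `MappingSketch` are hypothetical names, the output follows the Elasticsearch 1.x mapping style that was current in 2015 (`string` fields with `index` set to `not_analyzed` or `no`), every field is typed as `string` for brevity, and the language-to-analyzer table is an illustrative subset.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, flattened view of a property definition.
class PropertyDef {
    final String name;
    final boolean i18n;      // value varies per language
    final boolean fullText;  // false = searchable, but not free text
    final boolean indexed;   // false corresponds to IndexType.NO
    final double boost;      // 1.0 = no boost

    PropertyDef(String name, boolean i18n, boolean fullText, boolean indexed, double boost) {
        this.name = name;
        this.i18n = i18n;
        this.fullText = fullText;
        this.indexed = indexed;
        this.boost = boost;
    }
}

class MappingSketch {
    // Illustrative subset: maps a language code to a built-in analyzer name.
    static String analyzerFor(String languageCode) {
        switch (languageCode) {
            case "en": return "english";
            case "fr": return "french";
            case "de": return "german";
            default:   return "standard";
        }
    }

    // Builds the mapping entry for one property in one language.
    static Map<String, Object> fieldMapping(PropertyDef def, String languageCode) {
        Map<String, Object> mapping = new LinkedHashMap<>();
        mapping.put("type", "string");
        if (!def.indexed) {
            mapping.put("index", "no");           // IndexType.NO => not indexed
            return mapping;
        }
        if (!def.fullText) {
            mapping.put("index", "not_analyzed"); // non-text searchable => not analyzed
        } else if (def.i18n) {
            mapping.put("analyzer", analyzerFor(languageCode)); // i18n => language analyzer
        }
        if (def.boost != 1.0) {
            mapping.put("boost", def.boost);      // carry the definition's boost over
        }
        return mapping;
    }
}
```

Under these assumptions, a boosted, internationalized title mapped for French comes out as a `string` field with the `french` analyzer and its boost, while a non-text property only gets `"index": "not_analyzed"`.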

Installation / configuration

Deploy module
SearchProvider selection
Elasticsearch configuration:
    Cluster name
    Indexed node types
    Excluded properties

A quick look at the data

Using elasticsearch-head (http://mobz.github.io/elasticsearch-head)
Requires CORS support (automatically activated in Dev mode by implementation)

Joining an existing cluster

Start a separate Elasticsearch instance
Change cluster name in settings to join separate instance cluster
Check indices

Benefits

Transparent to users as it uses the same search infrastructure / UI
Decouples searches from "normal" Digital Factory operations
Transparent scalability using Elasticsearch clustering

Current state

Initial implementation to validate concept
    Indexer
    Listeners
Available as an Enterprise Edition add-on: talk to your account manager!

Current limitations

Only one SearchProvider per server
Still requires converting results to JCR nodes, in particular to ensure proper permissions

Future

Better indexing to improve search performance
Avoid going back to JCR nodes to create a search hit to further decouple from Digital Factory core
Better access to Elasticsearch
More powerful search mechanisms

Questions?