PLAT-4 Understanding the SOLR Integration

Posted on 26-Jun-2015

Video accompanying this presentation: http://www.youtube.com/watch?v=1t3Z2pJyulA

Join us for a guided tour of the Alfresco SOLR integration and the new search sub-systems. We'll discuss how it works, the limitations of eventual consistency, and guidance for configuration and set-up. We'll also cover the steps required to migrate, improved PATH performance, in-query ACL evaluation, cross-language support, monitoring, and performance.

Transcript of PLAT-4 Understanding the SOLR Integration

Understanding The SOLR Integration Andy Hind • Senior Developer • twitter @andy_hind

Agenda

• Why SOLR?
• What is supported?
• Eventual consistency
• Configuration and setup
• How to migrate
• Status/reporting
• Improvements

Why SOLR?

• Issues…
  o Cluster – index per node
  o Performance
    • Permission evaluation
    • Structural queries
    • In-transaction indexing
  o Scale query independently
  o Cross-locale support
  o Sub-system and dynamic configuration

What is supported?

• Spaces store
• Archive
• Query languages
• NOT supported
  o WCM based on AVM
  o Records Management
  o All stores
  o Multi-tenant
  o In transaction (eventually consistent)

Eventual consistency

• SOLR is tracking Alfresco
  o Following transactions – a bit like clustering
  o Eventual consistency
  o Transactions that may take some time to commit
  o Two cores
    • SpacesStore
    • ArchiveStore
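
The tracking behaviour described on this slide can be illustrated with a toy model. This is only a sketch: `ToyRepository`, `ToyTracker` and their methods are hypothetical names invented for the illustration and are not Alfresco or SOLR APIs. It shows why a search issued before the next poll does not see a freshly committed transaction.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepository:
    """Stand-in for the repository: an append-only list of committed transactions."""
    transactions: list = field(default_factory=list)

    def commit(self, node_id, text):
        txn_id = len(self.transactions) + 1
        self.transactions.append((txn_id, node_id, text))
        return txn_id

    def transactions_after(self, txn_id):
        # Everything committed since the tracker last looked.
        return [t for t in self.transactions if t[0] > txn_id]


@dataclass
class ToyTracker:
    """Stand-in for one index core that follows the repository (hypothetical)."""
    repo: ToyRepository
    last_indexed_txn: int = 0
    index: dict = field(default_factory=dict)

    def poll(self):
        # Pull transactions newer than the last one indexed and apply them.
        for txn_id, node_id, text in self.repo.transactions_after(self.last_indexed_txn):
            self.index[node_id] = text
            self.last_indexed_txn = txn_id

    def search(self, term):
        return sorted(n for n, text in self.index.items() if term in text)

    def lag(self):
        # Number of committed transactions the index has not yet seen.
        return len(self.repo.transactions) - self.last_indexed_txn


repo = ToyRepository()
tracker = ToyTracker(repo)
repo.commit("node-1", "quarterly report")
before = tracker.search("report")  # index has not caught up yet
tracker.poll()
after = tracker.search("report")   # visible once the tracker has polled
```

In this model `before` is empty and `after` contains the node, which is the window of inconsistency the slide refers to.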

Eventual consistency

• Models
• ACLs
• Metadata
• Content
• Ownership
• Structure - PATH

High Level Architecture

[Diagram: the Repository, backed by Content Store(s) and Database Storage, sends Search Requests to Solr and receives Search Results. The Solr Cores (Workspace and Archive) poll the Repository asynchronously for index updates to Models, ACLs, and Properties & Content.]

Setup

• SOLR is a web app
  o zip
• Communicates over SSL
  o Generate and configure your certificates …
• Per core configuration in SOLR
  o Data location

•  Installer default

Configuration

• Search sub-systems
  o solr, lucene
  o Change configuration without restarting Alfresco
    • JMX/Share admin
• Lucene
  o Lots – sub-set in Share
• SOLR
  o Host/port/SSL

•  Properties

How to migrate

• Carry on using Lucene
• Configure SOLR
• Configure Alfresco
  o Support SOLR tracking
• Monitor SOLR tracking
• Switch sub-systems when ready
• You can switch back to Lucene
  o It will check its state as it does now at start up

Stats and reporting

• JMX/Share
  o Later ….
• Direct to SOLR
  o https://localhost:8443/solr/admin/cores?action=SUMMARY
  o https://localhost:8443/solr/admin/cores?action=REPORT
• Fix
  o JMX
  o https://localhost:8443/solr/admin/cores?action=FIX
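
The admin URLs above can also be called from a script. The following is a minimal sketch, assuming the host, port and actions shown on the slide. The helper names `core_admin_url` and `fetch` are made up for this example, and the certificate file paths passed to `fetch` are hypothetical PEM files exported from whatever certificates were generated during setup.

```python
import ssl
import urllib.request
from urllib.parse import urlencode

SOLR_BASE = "https://localhost:8443/solr"  # host and port as shown on the slide


def core_admin_url(action, base=SOLR_BASE):
    """Build an admin/cores URL for one of the actions named on the slide."""
    if action not in ("SUMMARY", "REPORT", "FIX"):
        raise ValueError(f"unexpected action: {action}")
    return f"{base}/admin/cores?{urlencode({'action': action})}"


def fetch(action, certfile, keyfile, cafile):
    """Fetch an admin response over SSL using a client certificate.

    certfile, keyfile and cafile are hypothetical paths; the real files
    depend on how the certificates were generated and configured.
    """
    context = ssl.create_default_context(cafile=cafile)
    context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    with urllib.request.urlopen(core_admin_url(action), context=context) as resp:
        return resp.read().decode("utf-8")
```

Only `core_admin_url` is side-effect free. `fetch` needs a running SOLR instance and valid certificates, so treat it as a starting point rather than a tested client.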

Improvements

• PATH
• Access evaluation
  o Query time
• Cross-language/locale support
  o Query/Tokenisation
  o Sorting
• SOLR
  o Query caching
  o Facets

Improvements …

• Cross-language
  o Standard tokenisation
  o Configurable
  o Default – SOLR WordDelimiterFilterFactory
    • BigWoof-123-A47.txt
    • .txt, Big, 123A, 123a47txt, 47, A47, BigWoof123A47txt
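
To give a feel for word-delimiter tokenisation, here is a simplified sketch in plain Python. It is not the SOLR WordDelimiterFilterFactory and does not reproduce its exact output or options. It only splits on punctuation, case changes and letter/digit boundaries, then joins the parts back together, which is the general idea behind the tokens listed on the slide. The function name `split_word_parts` is invented for the example.

```python
import re

# A capitalised or lower-case word, a run of capitals, or a run of digits.
_PART = re.compile(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+")


def split_word_parts(token):
    """Split a token on punctuation, case changes and letter/digit boundaries."""
    return _PART.findall(token)


parts = split_word_parts("BigWoof-123-A47.txt")
catenated = "".join(parts)  # all parts joined into a single token
```

Indexing both the individual parts and the catenated form is what lets a query for either a fragment or the run-together name match the same file.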

Improvements …

• Cross-language
  o Sort
    • d:text
      – en: peach péché pêche sin
      – fr: peach pêche péché sin
    • d:mltext
      – Nearest match
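
The difference between the two orderings comes from how accents are compared: left to right for English, and from the end of the word backwards for French. The sketch below is a hand-rolled approximation of that rule, not a real collator and not the code SOLR or Alfresco uses. Ordering accent marks by code point is an assumption that happens to be sufficient for these four words, and `sort_key` is a name made up for the illustration.

```python
import unicodedata


def _split(word):
    """Return (base letters, per-letter accent marks) for a word."""
    bases, accents = [], []
    for ch in unicodedata.normalize("NFD", word.lower()):
        if unicodedata.combining(ch) and accents:
            accents[-1] += ch  # attach the mark to the preceding base letter
        else:
            bases.append(ch)
            accents.append("")
    return "".join(bases), tuple(accents)


def sort_key(word, french=False):
    """Compare base letters first, then accents (reversed for French)."""
    base, accents = _split(word)
    if french:
        accents = accents[::-1]  # French rule: the last accent difference decides
    return (base, accents)


words = ["sin", "pêche", "péché", "peach"]
english_order = sorted(words, key=sort_key)
french_order = sorted(words, key=lambda w: sort_key(w, french=True))
```

With these inputs `english_order` places péché before pêche and `french_order` places pêche before péché, matching the two lines on the slide.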

Improvements …

• Indexing Control
  o cm:indexControl
  o cm:isIndexed (Boolean)
    • Enable/disable All indexing (properties & content)
  o cm:isContentIndexed (Boolean)
    • Enable/disable Content Indexing
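
How the two flags combine can be written down as a small decision function. This is a sketch of the rule as the slide states it, not Alfresco code. The function name `indexing_plan` is hypothetical, and treating a missing flag as `True` (index everything) is an assumption made for the example.

```python
def indexing_plan(properties):
    """Return (index_metadata, index_content) for a node's property map."""
    # Assumption: a node without the flags is indexed in full.
    is_indexed = properties.get("cm:isIndexed", True)
    is_content_indexed = properties.get("cm:isContentIndexed", True)
    if not is_indexed:
        # cm:isIndexed switches off all indexing, properties and content alike.
        return (False, False)
    # Otherwise properties are indexed; content depends on the second flag.
    return (True, bool(is_content_indexed))


metadata_only = indexing_plan({"cm:isIndexed": True, "cm:isContentIndexed": False})
nothing = indexing_plan({"cm:isIndexed": False, "cm:isContentIndexed": True})
```

Note that `cm:isIndexed` takes precedence: when it is false, the content flag is irrelevant.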

Improvements …

• Canned Queries
  o How is Share affected by eventual consistency?
  o DB
  o Not Lucene/SOLR

Where is SOLR/Lucene used?

• Advanced Search
• Filters
• Tags (not the roll up)
• Categories (facets)
• Dashlets
  o E.g. Recently Modified
• People, Groups, Sites will use DB query unless
  o Start with *xyz
  o Other wildcards

SOLR futures

• SOLR cloud
• SOLR/Lucene improvements
  o Performance
  o Future 3.4, 4.0, ...
• Geo

Demos ….

Questions?