How to Gain Greater Business Intelligence from Lucene/Solr
Patrick Beaucamp, Founder of the Vanilla Project
Mail : [email protected]
How to Gain Greater Business Intelligence with Vanilla from Solr/Lucene
LuceneRevolution, Boston
Presentation Agenda
Vanilla powered by Lucene
- Report indexation, search interface
- External document management
- Evolution & constraints
Steps to Solr/Lucene Adoption
- Indexation, storage, search
- Embedded Solr/Lucene
- External Solr/Lucene platform
Key Benefits for Vanilla powered by Solr/Lucene
- Cluster architecture
- Cache mechanism
- Support for enhanced search language
Some Vanilla Features
Flash maps and charts: Reports, Cubes and Dashboards
Vanilla apps: Android and iPhone
Vanilla Powered by Lucene (1/6)
Vanilla is a full Business Intelligence platform that provides:
- Reporting, OLAP, Dashboard, KPI, Maps Visualisation
- ETL, Workflow, Document Management search engine
Vanilla Powered by Lucene (2/6)
Report Indexation
- Search engine is Apache Lucene (summer 2010)
- External documents & Vanilla reports are indexed
- Different indexation strategies for documents:
  – No indexation
  – Real-time indexation
  – Late indexation
2 modules to manage the indexation strategy
- Enterprise Services to set document properties
- Norparena to manage indexation
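The three indexation strategies can be pictured with a minimal sketch. Everything here is hypothetical and for illustration only: the `IndexationStrategy` and `Indexer` names are not part of Vanilla, and a plain dictionary stands in for the Lucene index.

```python
from enum import Enum


class IndexationStrategy(Enum):
    NONE = "none"            # document is never indexed
    REAL_TIME = "real_time"  # indexed as soon as it is stored
    LATE = "late"            # queued, indexed later by a batch run


class Indexer:
    """Hypothetical indexer; a dict stands in for the Lucene index."""

    def __init__(self):
        self.index = {}    # doc_id -> content
        self.pending = []  # documents waiting for late indexation

    def store(self, doc_id, content, strategy):
        if strategy is IndexationStrategy.REAL_TIME:
            self.index[doc_id] = content
        elif strategy is IndexationStrategy.LATE:
            self.pending.append((doc_id, content))
        # IndexationStrategy.NONE: the document is stored but not indexed

    def run_late_indexation(self):
        """Index every queued document and return how many were processed."""
        count = len(self.pending)
        for doc_id, content in self.pending:
            self.index[doc_id] = content
        self.pending.clear()
        return count
```

In this sketch the strategy is passed per document, which mirrors the idea of setting it as a document property.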
Vanilla Powered by Lucene (3/6)
Search Interface
- Search interface available from the Vanilla Portal
- Search runs against the Lucene index (inside Vanilla)
- Search results are combined with security on documents
  – List contains all documents
  – Documents are ordered based on popularity
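One possible reading of "combined with security" is that raw hits are filtered by the user's rights and then ordered by popularity. The sketch below assumes that reading; the `Hit` fields and the group-based security model are hypothetical, not Vanilla's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Hit:
    doc_id: str
    popularity: int                       # e.g. number of views (assumed metric)
    allowed_groups: set = field(default_factory=set)


def visible_hits(hits, user_groups):
    """Keep hits the user may see, most popular first."""
    groups = set(user_groups)
    allowed = [h for h in hits if h.allowed_groups & groups]
    return sorted(allowed, key=lambda h: h.popularity, reverse=True)
```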
Vanilla Powered by Lucene (4/6)
External document management
- Various document formats are available (Lucene)
- Additional properties can be set on documents, for later use in search criteria
- Check in / check out on documents for versioning
- Search is run on the latest document version
Vanilla Powered by Lucene (5/6)
Evolution and constraints
- No clustering available for the search engine (embedded API), as opposed to Vanilla Report Services
- Limitation in language and keywords (internal search)
- No cache to manage the search result set, as opposed to Vanilla datasets, powered by Memcached
- Requests from customers to be compliant with enterprise search engines → need to set up an external search architecture
Vanilla Powered by Lucene (6/6)
Embedded Lucene API inside Vanilla Platform - Video
Steps to Solr/Lucene Adoption (1/9)
Solr/Lucene is the natural evolution of any embedded Lucene platform
Solr version: 3.5
Indexation
The Vanilla Lucene index can be transferred to & read by a Solr/Lucene platform (a Solr/Lucene index is not usable inside the Vanilla Platform)
Storage
Vanilla search indexes can be managed by a Solr/Lucene platform
Search
Search language is compliant
Steps to Solr/Lucene Adoption (2/9)
Embedded Solr/Lucene inside Vanilla Platform
No need for any change in Vanilla code: use of the SolrJ API
Immediately provides additional features such as new keywords
Potential upgrade to Solr/Lucene Enterprise
Steps to Solr/Lucene Adoption (3/9)
From Embedded Lucene to Embedded Solr/Lucene inside Vanilla Platform
Steps to Solr/Lucene Adoption (4/9)
Embedded Solr/Lucene inside Vanilla Platform - Video
Steps to Solr/Lucene Adoption (5/9)
Solr/Lucene Platform with a Vanilla Platform
Need for changes in Vanilla code, to separate the document management, indexation & search APIs → 10 man-days workload
Document Management API
Easy to move towards CMIS compliance
Indexation & Search API
Solr/Lucene oriented & compliant, but now open to any other search platform
Steps to Solr/Lucene Adoption (6/9)
Coding Before
Example of code (API) before the split
- Direct use of the Lucene API
- Parse the document content using Apache Tika
- Generate Lucene queries
Steps to Solr/Lucene Adoption (7/9)
Coding After
Example of code (API) after the split
- Easy-to-use SolrJ API
- Distributed search
- Indexation with automatic parsing (using Apache Tika)
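As a rough illustration of indexation with automatic parsing, the sketch below builds the request URL for Solr's ExtractingRequestHandler, which runs Tika on the server side. It is not the code shown on the slide. It assumes the handler is mapped at `/update/extract` (the path used in Solr's example configuration) and that the schema's unique key is `id`; the base URL and document id are made up.

```python
from urllib.parse import urlencode


def extract_url(base_url, doc_id, commit=True):
    """Build the URL to which a raw document (PDF, Word, ...) would be POSTed.

    literal.id sets the unique key of the indexed document; commit=true
    asks Solr to make it searchable right away.
    """
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return f"{base_url.rstrip('/')}/update/extract?{urlencode(params)}"


url = extract_url("http://localhost:8983/solr", "report-42")
```

The file content itself would be sent as the body of an HTTP POST to this URL; the client no longer needs to call Tika directly.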
Steps to Solr/Lucene Adoption (8/9)
Solr/Lucene Platform with Vanilla Platform - Screenshot
Steps to Solr/Lucene Adoption (9/9)
Solr/Lucene Platform with Vanilla Platform - Video
Key Benefits for Vanilla Powered by Solr/Lucene (1/4)
Clustering Search Architecture, outside of Vanilla
Search results clustering implementation (CarrotClusteringEngine) is based on the Carrot2 framework.
Key Benefits for Vanilla Powered by Solr/Lucene (2/4)
Additional query language to perform search
Solr Uses the Lucene Search Library and Extends it!
- A Real Data Schema, with Numeric Types, Dynamic Fields, Unique Keys
- Powerful Extensions to the Lucene Query Language
- Faceted Search and Filtering
- Geospatial Search
- Advanced, Configurable Text Analysis
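To make faceted search and filtering concrete, here is a minimal sketch that builds a Solr select URL with the standard `q`, `fq`, `facet`, `facet.field` and `wt` parameters. The base URL and the field names (`title`, `type`, `author`) are assumptions for illustration, not fields from the Vanilla schema.

```python
from urllib.parse import urlencode


def select_url(base_url, query, filters=(), facet_fields=(), wt="json"):
    """Build a Solr /select URL with optional filter queries and facets."""
    params = [("q", query), ("wt", wt)]
    params += [("fq", f) for f in filters]           # each fq narrows the result set
    if facet_fields:
        params.append(("facet", "true"))
        params += [("facet.field", f) for f in facet_fields]
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"


url = select_url(
    "http://localhost:8983/solr",
    "title:sales",
    filters=["type:report"],
    facet_fields=["author"],
)
```

Filter queries restrict the result set without affecting relevance scoring, which is why the filter is kept separate from `q` here.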
Key Benefits for Vanilla Powered by Solr/Lucene (3/4)
New methods to manage result sets (binary, XML, JSON)
Solr is an enterprise search server with a REST-like API. You put documents in it (called "indexing") via XML, JSON or binary over HTTP. You query it via HTTP GET and receive XML, JSON, or binary results.
- Advanced Full-Text Search Capabilities
- Optimized for High Volume Web Traffic
- Standards Based Open Interfaces - XML, JSON and HTTP
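A JSON result set can be consumed with nothing more than a JSON parser. The sketch below reads the usual shape of a Solr select response (`responseHeader`, then `response` with `numFound` and `docs`); the sample payload, including its ids and titles, is invented for illustration.

```python
import json

# Invented sample in the shape Solr returns for wt=json.
SAMPLE = """{"responseHeader": {"status": 0, "QTime": 3},
 "response": {"numFound": 2, "start": 0, "docs": [
   {"id": "report-1", "title": "Sales 2011"},
   {"id": "report-2", "title": "Sales 2012"}]}}"""


def parse_select_response(body):
    """Return (total number of matches, ids of the documents on this page)."""
    response = json.loads(body)["response"]
    return response["numFound"], [doc["id"] for doc in response["docs"]]


total, ids = parse_select_response(SAMPLE)
```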
Key Benefits for Vanilla Powered by Solr/Lucene (4/4)
Cache Mechanism
Solr caches are associated with an Index Searcher
Three cache implementations: solr.LRUCache (LRU = Least Recently Used, in memory), solr.FastLRUCache, solr.LFUCache (LFU = Least Frequently Used)
Many configuration parameters for cache optimisation
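The LRU eviction policy behind solr.LRUCache can be sketched in a few lines. This is an illustration of the policy only, not Solr's implementation; the `LRUResultCache` class and its `size` parameter are hypothetical (although `size` does echo one of the cache settings in solrconfig.xml).

```python
from collections import OrderedDict


class LRUResultCache:
    """Keeps at most `size` result sets, evicting the least recently used."""

    def __init__(self, size):
        self.size = size
        self.entries = OrderedDict()  # oldest entry first

    def get(self, query):
        if query not in self.entries:
            return None
        self.entries.move_to_end(query)  # mark as most recently used
        return self.entries[query]

    def put(self, query, results):
        self.entries[query] = results
        self.entries.move_to_end(query)
        if len(self.entries) > self.size:
            self.entries.popitem(last=False)  # drop the least recently used
```

An LFU cache would instead track how often each entry is read and evict the one with the lowest count.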
Next Steps
Upgrade to Solr 4.0
New features for document lifecycle management
Roadmap for better internationalisation:
- 10 languages available (not Japanese)
- Search translation management
Documentations and tutorials available on our Web sites:
www.bpm-conseil.com and forge.bpm-conseil.com
Thanks for your attention