Posted on 16-Jul-2015
Search can be smarter.
[Figure labels: location, search history, query, security context]
Personal, contextual, relevant results: consumer-like simplicity and power in the enterprise.
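Table columns: Product Offering, Environment, Features, Support Level, Additional Support, Availability, Response Time, Number of Incidents, Pricing Model
Solr Enterprise
• Support Level: Dev Support (4 Contacts), Operational Support
• Additional Support: Regular Health Checks
• Availability: 24x7
• Response Time: SLA-Backed
• Number of Incidents: Unlimited Incidents
• Pricing Model: Per Node
• Features: Security, Log Analysis / SiLK Support, Dashboards & Reporting, Enhanced Admin UI
Fusion
• Support Level: Dev Support (4 Contacts), Operational Support
• Additional Support: Regular Health Checks
• Availability: 24x7
• Response Time: SLA-Backed
• Number of Incidents: Unlimited Incidents
• Pricing Model: Per Node
• Features: Security, Crawlers & Connectors, Log Analysis / SiLK Support, Enhanced Admin UI, Data Enrichment, Machine Learning, Recommendations, Advanced Relevancy Tuning
Developer Support
• How-To Support, Knowledge Base, Fusion Support
• Availability: 9x5
• Response Time: SLA-Backed
• Number of Incidents: Unlimited Incidents
• Pricing Model: Per Named Developer
Environments: Production, Development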
• Easy to start/stop
./bin/solr {start|stop}
• Create collections:
./bin/solr create -c <COLL_NAME>
• No more WAR! Web container (Jetty) is now an implementation detail
• Scripts to support installing and running Solr as a service on Linux.
Get Started
JSON’s great:
• Solr 5 “does the right thing” for JSON out of the box
Except when it isn’t:
• Most data isn’t JSON
• Solr handles CSV, XML, Rich Content out of the box without having to install plugins
Your Content, Your Way
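As a rough illustration of the JSON and CSV points above, the sketch below builds (but does not send) two update requests. The host, port, collection name `demo`, and the sample documents are assumptions made for the example and do not come from the slides; `/update` with a `Content-Type` header is the standard Solr update endpoint.

```python
import json
import urllib.request

# Assumption: a local Solr on the default port; adjust for a real deployment.
SOLR_URL = "http://localhost:8983/solr"


def build_update_request(collection, body, content_type):
    """Build a POST to the collection's /update handler without sending it."""
    url = f"{SOLR_URL}/{collection}/update?commit=true"
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


# JSON: a list of documents, each a flat object of field/value pairs.
docs = [{"id": "1", "title": "Solr 5 in action"}]
json_req = build_update_request(
    "demo", json.dumps(docs).encode("utf-8"), "application/json"
)

# CSV: the first row names the fields, later rows are documents.
csv_body = "id,title\n2,Lucene 5 highlights\n".encode("utf-8")
csv_req = build_update_request("demo", csv_body, "application/csv")

# To index against a running Solr: urllib.request.urlopen(json_req)
```

The same handler accepts both payloads; only the `Content-Type` header changes.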
Your Content, Your Way
• Solr 5 will ship Tika 1.7, adding:
• OCR support
• PST and Matlab
• Better Date Handling
• More flexibility with spatial units
• Stats and Pivot faceting now work together
• Focused on accuracy of results
• First few steps in unification of all facet types with stats and aggregations
• http://lucidworks.com/blog/you-got-stats-in-my-facets/
Pivots and Stats
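A minimal sketch of how stats and pivot facets can be combined in one request: a tagged `stats.field` is referenced from `facet.pivot` through the `stats` local parameter. The field names `price`, `cat`, and `inStock` and the tag `piv1` are assumptions for illustration, not something the slides specify.

```python
from urllib.parse import parse_qs, urlencode


def pivot_stats_params(stats_field, pivot_fields, tag="piv1"):
    """Return query parameters that attach stats to every pivot bucket."""
    return [
        ("q", "*:*"),
        ("rows", "0"),
        ("stats", "true"),
        # Tag the stats field so the pivot can refer to it by name.
        ("stats.field", f"{{!tag={tag}}}{stats_field}"),
        ("facet", "true"),
        # Ask for the tagged stats inside each pivot bucket.
        ("facet.pivot", f"{{!stats={tag}}}{','.join(pivot_fields)}"),
    ]


params = pivot_stats_params("price", ["cat", "inStock"])
query_string = urlencode(params)  # append to /solr/<collection>/select?
```

Each bucket in the pivot response would then carry its own stats block for the tagged field.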
• Schema API: REST API for adding field types and dynamic fields
• Managing Request Handlers through API
• Implicit registration of replication, Real Time Get and Administration Handlers
• Improved APIs for managing collections
API Goodness
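The Schema API bullet can be sketched as a single JSON POST carrying one command per change. The example below only builds the request; the collection name `demo`, the type name `text_simple`, and the `*_simple` pattern are hypothetical, and it assumes the collection uses a managed (mutable) schema.

```python
import json
import urllib.request


def schema_request(collection, commands, base_url="http://localhost:8983/solr"):
    """Build a POST to /solr/<collection>/schema without sending it."""
    return urllib.request.Request(
        f"{base_url}/{collection}/schema",
        data=json.dumps(commands).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Commands run in the order given, so the type exists before it is used.
commands = {
    "add-field-type": {
        "name": "text_simple",  # hypothetical type name
        "class": "solr.TextField",
        "analyzer": {"tokenizer": {"class": "solr.StandardTokenizerFactory"}},
    },
    "add-dynamic-field": {
        "name": "*_simple",  # hypothetical dynamic field pattern
        "type": "text_simple",
        "stored": True,
    },
}
req = schema_request("demo", commands)

# To apply against a running Solr: urllib.request.urlopen(req)
```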
Lucene 5 Highlights
• Stronger index safety guarantees
• Reduced memory usage in a number of areas
• No more FieldCache (replaced w/ UninvertingReader)
• Multi-valued sorting and suggesters
• Better IO defaults when using SSDs
• More efficient handling of merging stored fields
Go Big
• Many scaling improvements focused on interactions with Zookeeper:
• Split cluster state management reduces chattiness in large multi-tenant implementations
• Performance of Overseer operations improved by more than 40%
• Better timeout defaults based on real-world testing
• See my Lucene Revolution Keynote for more details: http://bit.ly/shalinRevKeynote
Distributed IDF
• IDF = Inverse Document Frequency = A measure of the relative importance of a word in a collection
• 4 implementations:
• LocalStatsCache: Local Stats
• ExactStatsCache: One time use aggregation
• ExactSharedStatsCache: Stats shared across requests
• LRUStatsCache: Stats shared in an LRU cache across requests
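To see why distributed IDF matters, the sketch below compares per-shard IDF (what local stats give you) with IDF computed from stats aggregated across shards (what the exact caches aim for). It assumes the classic Lucene TF-IDF formula, idf = 1 + ln(numDocs / (docFreq + 1)); the shard sizes and document frequencies are invented for illustration.

```python
import math


def idf(doc_freq, num_docs):
    """Classic Lucene TF-IDF formula: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Made-up stats for one term: rare on shard1, common on shard2.
shards = {
    "shard1": {"num_docs": 1000, "doc_freq": 5},
    "shard2": {"num_docs": 1000, "doc_freq": 500},
}

# Local stats: each shard scores the term using only its own counts.
local_idf = {
    name: idf(s["doc_freq"], s["num_docs"]) for name, s in shards.items()
}

# Aggregated stats: sum the counts across shards, then compute IDF once.
total_docs = sum(s["num_docs"] for s in shards.values())
total_freq = sum(s["doc_freq"] for s in shards.values())
global_idf = idf(total_freq, total_docs)

for name, value in local_idf.items():
    print(f"{name}: local idf = {value:.3f}")
print(f"global idf = {global_idf:.3f}")
```

With local stats the same term is weighted very differently on the two shards, so scores from different shards are not directly comparable when results are merged; with aggregated stats every shard uses the same weight.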
• Ease of getting started means nothing if you can’t stay running in production
• Jepsen tests simulate network partitions, data loss, i.e. “The Real World”
• https://github.com/LucidWorks/jepsen/tree/solr-jepsen
• http://bit.ly/solr-jepsen
Get Finished
Stability Improvements
• Protection of ZK content
• ReplicationHandler now has an option to throttle the speed of replication
• More control over terminating long running queries
• Finite default timeouts for select and update requests
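One existing knob for bounding query time is the `timeAllowed` request parameter (in milliseconds); whether this is the exact mechanism the slide has in mind is an assumption. The sketch builds such a query and checks a response for the `partialResults` flag that Solr sets in the response header when the limit is hit. The sample response is hand-written for illustration, not real output.

```python
from urllib.parse import parse_qs, urlencode


def bounded_query(q, time_allowed_ms):
    """Build select parameters that cap how long the search may run."""
    return urlencode({"q": q, "timeAllowed": str(time_allowed_ms), "wt": "json"})


def is_partial(response):
    """True if Solr reported that the time limit cut the search short."""
    header = response.get("responseHeader", {})
    return bool(header.get("partialResults", False))


query_string = bounded_query("title:solr", 500)

# Hand-written sample of a timed-out response (illustrative only).
sample = {
    "responseHeader": {"status": 0, "QTime": 502, "partialResults": True},
    "response": {"numFound": 0, "start": 0, "docs": []},
}
if is_partial(sample):
    print("Query hit the time limit; results may be incomplete.")
```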
• Facets and Analytics:
• Mix and match all facet types and stats (SOLR-6352, SOLR-6353, SOLR-4212)
• Percentiles via t-digest (SOLR-6350)
• Replication performance (SOLR-6816)
• Finish off Config APIs (various)
• Data location aware ValueSource implementation for fast changing distributed data
• First class support for more languages OOTB
Near Term Road Map
Resources
Release Notes:
• Solr: http://wiki.apache.org/solr/ReleaseNote50
• Lucene: https://wiki.apache.org/lucene-java/ReleaseNote50
Lucidworks: http://www.lucidworks.com
Shalin Shekhar Mangar
• shalin@apache.org
• Twitter: https://twitter.com/shalinmangar
Credits
• What’s new in Solr 5.0 - Anshum Gupta: http://www.slideshare.net/anshumg/solr-50
• Lucidworks webinar “Inside Solr 5” - Grant Ingersoll: http://www.slideshare.net/lucidworks/webinar-inside-apache-solr-5