Benchmarking Solr Performance at Scale
About Me
• Lucene/Solr committer. Work for Lucidworks; focus on hardening SolrCloud, devops, and big data architecture / deployments
• Operated a smallish cluster in AWS for Dachis Group (1.5 years ago: 18 shards, ~900M docs)
• Solr Scale Toolkit: Fabric/boto framework for deploying and managing clusters in EC2
• Co-author of Solr In Action with Trey Grainger
Agenda
1. Quick review of the SolrCloud architecture
2. Indexing & Query performance tests
3. Solr Scale Toolkit (quick overview)
4. Q & A
Solr in the wild …
https://twitter.com/bretthoerner/status/476830302430437376
SolrCloud distilled
A subset of optional features in Solr that enable and
simplify horizontal scaling of a search index using
sharding and replication.
Goals
performance, scalability, high-availability,
simplicity, elasticity, and
community-driven!
Collection == distributed index
A collection is a distributed index defined by:
• named configuration stored in ZooKeeper
• number of shards: documents are distributed across N partitions of the index
• document routing strategy: how documents get assigned to shards (see the routing sketch below)
• replication factor: how many copies of each document in the collection
Collections API:
curl "http://localhost:8983/solr/admin/collections?
action=CREATE&name=logstash4solr&replicationFactor=2&
numShards=2&collection.configName=logs"
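To illustrate the document routing strategy, here is a minimal SolrJ 4.x sketch (the collection name comes from the curl example above; the id and field values are hypothetical). With the default compositeId router, the part of the id before the "!" is hashed to pick the shard, so documents sharing that prefix are co-located:

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class RoutingExample {
  public static void main(String[] args) throws Exception {
    // CloudSolrServer (renamed CloudSolrClient in 5.x) reads cluster
    // state from ZooKeeper instead of talking to a fixed Solr node
    CloudSolrServer solr = new CloudSolrServer("localhost:2181");
    solr.setDefaultCollection("logstash4solr");

    SolrInputDocument doc = new SolrInputDocument();
    // compositeId routing: "user1234" is hashed to choose the shard,
    // so all of user1234's documents land on the same shard
    doc.addField("id", "user1234!doc5678");
    solr.add(doc);
    solr.commit();
    solr.shutdown();
  }
}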
SolrCloud High-level Architecture
ZooKeeper
• Is a very good thing ... clusters are a zoo!
• Centralized configuration management
• Cluster state management
• Leader election (shard leader and overseer)
• Overseer distributed work queue
• Live Nodes
• Ephemeral znodes used to signal a server is gone
• Needs at least 3 nodes for quorum in production
ZooKeeper: State Management
• Keep track of live nodes via the /live_nodes znode
• ephemeral znodes
• removed when the ZooKeeper client session times out
• Collection metadata and replica state in /clusterstate.json
• Every Solr node has watchers for /live_nodes and /clusterstate.json
• Leader election
• ZooKeeper sequence number on ephemeral znodes
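To make the watcher mechanics concrete, here is a minimal sketch using the plain Apache ZooKeeper client to mimic what each Solr node does: read /live_nodes and /clusterstate.json with watches, and re-register after every event since ZooKeeper watches are one-shot. The 15000 ms session timeout echoes the recommendation later in this deck; error handling is omitted:

import java.util.List;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ClusterStateWatcher implements Watcher {
  private ZooKeeper zk;

  public void start() throws Exception {
    // 15000 ms session timeout: expiration is what triggers leader election
    zk = new ZooKeeper("localhost:2181", 15000, this);
    watch();
  }

  private void watch() throws Exception {
    // passing 'true' registers this object as the (one-shot) watcher
    List<String> liveNodes = zk.getChildren("/live_nodes", true);
    byte[] state = zk.getData("/clusterstate.json", true, null);
    System.out.println(liveNodes.size() + " live nodes, "
        + state.length + " bytes of cluster state");
  }

  public void process(WatchedEvent event) {
    // ignore connection-state events; re-register after real znode events
    if (zk == null || event.getType() == Event.EventType.None) return;
    try {
      watch();
    } catch (Exception e) {
      e.printStackTrace();
    }
  }

  public static void main(String[] args) throws Exception {
    new ClusterStateWatcher().start();
    Thread.sleep(Long.MAX_VALUE);  // keep the JVM alive to receive events
  }
}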
Scalability Highlights
• No split-brain problems (b/c of ZooKeeper)
• All nodes in cluster perform indexing and execute queries; no master node
• Distributed indexing: No SPoF, high throughput via direct updates to leaders, automated failover to new leader
• Distributed queries: Add replicas to scale out QPS; parallelize complex query computations; fault tolerance
• Indexing / queries continue so long as there is 1 healthy replica per shard
Cluster sizing
How many servers do I need to index X docs?
... shards ... ?
... replicas ... ?
I need N queries per second over M docs, how many servers do I need?
It depends!
Testing Methodology
• Transparent repeatable results
• Ideally hoping for something owned by the community
• Synthetic docs ~ 1K each on disk, mix of field types
• Data set created using code borrowed from PigMix
• English text fields generated using a Zipfian distribution (see the sketch after this list)
• Java 1.7u67, Amazon Linux, r3.2xlarge nodes
• enhanced networking enabled, placement group, same AZ
• Stock Solr (cloud) 4.10
• Using custom GC tuning parameters and auto-commit settings
• Use Elastic MapReduce to generate indexing load
• As many nodes as I need to drive Solr!
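The PigMix-derived generator isn't reproduced in the deck; the sketch below illustrates only the core idea: sample words by rank from a Zipfian distribution so term frequencies resemble natural English text. The vocabulary size, exponent, and "wordN" tokens are made-up stand-ins:

import java.util.Random;

public class ZipfTextGenerator {
  private final double[] cumulative;   // cumulative probability by rank
  private final Random rnd = new Random();

  public ZipfTextGenerator(int vocabSize, double exponent) {
    cumulative = new double[vocabSize];
    double total = 0;
    for (int rank = 1; rank <= vocabSize; rank++)
      total += 1.0 / Math.pow(rank, exponent);
    double sum = 0;
    for (int rank = 1; rank <= vocabSize; rank++) {
      sum += (1.0 / Math.pow(rank, exponent)) / total;
      cumulative[rank - 1] = sum;
    }
  }

  public String nextField(int numWords) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < numWords; i++) {
      double p = rnd.nextDouble();
      int rank = 0;
      // linear scan for clarity; use binary search for large vocabularies
      while (rank < cumulative.length - 1 && cumulative[rank] < p) rank++;
      sb.append("word").append(rank).append(' ');
    }
    return sb.toString().trim();
  }

  public static void main(String[] args) {
    // roughly 1K of text per doc, as in the test data described above
    ZipfTextGenerator gen = new ZipfTextGenerator(30000, 1.0);
    System.out.println(gen.nextField(150));
  }
}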
Indexing Performance
Cluster Size   # of Shards   # of Replicas   Reducers   Time (secs)   Docs / sec
10             10            1               48         1762          73,780
10             10            2               34         3727          34,881
10             20            1               48         1282          101,404
10             20            2               34         3207          40,536
10             30            1               72         1070          121,495
10             30            2               60         3159          41,152
15             15            1               60         1106          117,541
15             15            2               42         2465          52,738
15             30            1               60         827           157,195
15             30            2               42         2129          61,062
Visualize Server Performance
Direct Updates to Leaders
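CloudSolrServer is the SolrJ client that enables this direct-update path: it watches cluster state in ZooKeeper, hashes each document id, and posts updates straight to the right shard leader rather than relying on server-side forwarding. A minimal batching sketch (the batch size and document fields are illustrative; the collection and ZooKeeper address reuse the earlier examples):

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BatchIndexer {
  public static void main(String[] args) throws Exception {
    CloudSolrServer solr = new CloudSolrServer("localhost:2181");
    solr.setDefaultCollection("logstash4solr");

    List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
    for (int i = 0; i < 10000; i++) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "doc" + i);
      batch.add(doc);
      if (batch.size() == 500) {   // send in batches, not one-by-one;
        solr.add(batch);           // SolrJ splits each batch by shard and
        batch.clear();             // posts each piece to its leader
      }
    }
    if (!batch.isEmpty()) solr.add(batch);
    // rely on the server-side auto-commit settings instead of per-batch commits
    solr.shutdown();
  }
}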
Replication
Indexing Performance Lessons
• Solr has no built-in throttling support; it will accept work until it falls over, so build throttling into your indexing application logic (see the sketch after this list)
• Oversharding helps parallelize indexing work and gives you an easy way to add more hardware to your cluster
• GC tuning is critical (more below)
• Auto-hard commit to keep transaction logs manageable
• Auto soft-commit to see docs as they are indexed
• Replication is expensive! (more work needed here)
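The deck doesn't prescribe a throttling mechanism; one common pattern, sketched here with plain java.util.concurrent, is a bounded thread pool whose rejection policy pushes back on the producer so the indexer can never queue unbounded work for Solr. The pool size, queue depth, and batch count are hypothetical:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ThrottledSubmitter {
  public static void main(String[] args) {
    // 8 sender threads and at most 100 queued batches; when the queue is
    // full, CallerRunsPolicy makes the producing thread do the send itself,
    // which slows the producer to the rate Solr can actually absorb
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        8, 8, 0L, TimeUnit.MILLISECONDS,
        new ArrayBlockingQueue<Runnable>(100),
        new ThreadPoolExecutor.CallerRunsPolicy());

    for (int b = 0; b < 1000; b++) {
      final int batchId = b;
      pool.execute(new Runnable() {
        public void run() {
          // send batch 'batchId' to Solr here (e.g. CloudSolrServer.add)
          System.out.println("sent batch " + batchId);
        }
      });
    }
    pool.shutdown();
  }
}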
GC Tuning
• Stop-the-world GC pauses can lead to ZooKeeper session expiration (which is bad)
• More JVMs with smaller heap sizes are better! (12-16 GB max per JVM; less if you can)
• MMapDirectory relies on sufficient memory available to the OS cache (off-heap)
• GC activity during Solr indexing is stable and generally doesn’t cause any stop-the-world collections … queries are a different story
• Enable verbose GC logging (even in prod) so you can troubleshoot issues:
-verbose:gc -Xloggc:gc.log -XX:+PrintHeapAtGC -XX:+PrintGCDetails \
-XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps \
-XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime \
-XX:+PrintGCApplicationConcurrentTime
GC Flags I use with Solr
-Xss256k \
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
-XX:MaxTenuringThreshold=8 -XX:NewRatio=3 \
-XX:CMSInitiatingOccupancyFraction=40 \
-XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 \
-XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 \
-XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=12m \
-XX:CMSFullGCsBeforeCompaction=1 \
-XX:+UseCMSInitiatingOccupancyOnly \
-XX:CMSTriggerPermRatio=80 \
-XX:CMSMaxAbortablePrecleanTime=6000 \
-XX:+CMSParallelRemarkEnabled \
-XX:+ParallelRefProcEnabled \
-XX:+UseLargePages -XX:+AggressiveOpts
Sizing GC Spaces
http://kumarsoablog.blogspot.com/2013/02/jvm-parameter-survivorratio_7.html
Query Performance
• Still a work in progress!
• Measure sustained QPS and 99th-percentile execution time (Coda Hale's Metrics library is good for this; see the sketch after this list)
• Stable: ~5,000 QPS / 99th at 300ms while indexing ~10,000 docs / sec
• Using the TermsComponent to build queries based on the terms in each field.
• Harder to accurately simulate user queries over synthetic data
• Need mix of faceting, paging, sorting, grouping, boolean clauses, range queries, boosting, filters (some cached, some not), etc ...
• Does the randomness in your test queries model (expected) user behavior?
• Start with one server (1 shard) to determine baseline query performance.
• Look for inefficiencies in your schema and other config settings
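A minimal sketch of tracking both numbers with Coda Hale's Metrics library (the com.codahale.metrics dependency is assumed; the Solr query call is stubbed out with a sleep):

import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

public class QueryBenchmark {
  public static void main(String[] args) throws Exception {
    MetricRegistry registry = new MetricRegistry();
    Timer queries = registry.timer("queries");

    for (int i = 0; i < 10000; i++) {
      Timer.Context ctx = queries.time();
      try {
        // execute one Solr query here (e.g. CloudSolrServer.query)
        Thread.sleep(1);  // stand-in for real query latency
      } finally {
        ctx.stop();
      }
    }

    // the one-minute rate approximates sustained QPS; the snapshot gives p99
    System.out.printf("QPS: %.1f  p99: %.1f ms%n",
        queries.getOneMinuteRate(),
        queries.getSnapshot().get99thPercentile() / 1e6);  // nanos -> millis
  }
}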
Query Performance, cont.
• Higher risk of full GC pauses (facets, filters, sorting)
• Use optimized data structures (DocValues) for facet / sort fields, Trie-based numeric fields for range queries, and facet.method=enum for low-cardinality fields (see the query sketch after this list)
• Check sizing of caches, esp. filterCache in solrconfig.xml
• Add more replicas; load-balance; Solr can set HTTP headers to work with caching proxies like Squid
• -Dhttp.maxConnections=## (default = 5, increase to accommodate more threads sending queries)
• Avoid increasing the ZooKeeper client timeout; ~15000 ms (15 seconds) is about right
• Don't just keep throwing more memory at Java! -Xmx128G
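To tie several of these points together, a hypothetical SolrJ query sketch: the field names text_en, category, and price are invented, with category assumed to be a low-cardinality DocValues field and price a Trie numeric field in the schema:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class QueryExample {
  public static void main(String[] args) throws Exception {
    CloudSolrServer solr = new CloudSolrServer("localhost:2181");
    solr.setDefaultCollection("logstash4solr");

    SolrQuery q = new SolrQuery("text_en:search");
    q.setFacet(true);
    q.addFacetField("category");               // DocValues field
    q.set("facet.method", "enum");             // good for low cardinality
    q.addFilterQuery("category:electronics");  // cached in the filterCache
    q.addFilterQuery("{!cache=false}price:[10 TO 100]");  // skip the cache for one-off ranges
    q.addSort("price", SolrQuery.ORDER.asc);

    QueryResponse rsp = solr.query(q);
    System.out.println("hits: " + rsp.getResults().getNumFound());
    solr.shutdown();
  }
}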
Call me maybe - Jepsen
• Solr tests being developed by Lucene/Solr committer Shalin Mangar (@shalinmangar)
• Prototype in place:
• No ack’d writes were lost!
• No un-ack’d writes succeeded
See: https://github.com/LucidWorks/jepsen/tree/solr-jepsen
https://github.com/aphyr/jepsen
Solr Scale Toolkit
• Open source: https://github.com/LucidWorks/solr-scale-tk
• Fabric (Python) toolset for deploying and managing SolrCloud clusters in the cloud
• Code to support benchmark tests (Pig script for data generation / indexing, JMeter samplers)
• EC2 for now, more cloud providers coming soon via Apache libcloud
• Contributors welcome!
• More info: http://searchhub.org/2014/06/03/introducing-the-solr-scale-toolkit/
Provisioning cluster nodes
• Custom built AMI (one for PV instances and one for HVM instances) – Amazon Linux
• Block device mapping
• dedicated disk per Solr node
• Launch and then poll status until they are live
• verify SSH connectivity
• Tag each instance with a cluster ID and username
fab new_ec2_instances:test1,n=3,instance_type=m3.xlarge
Deploy ZooKeeper ensemble
• Two options:
• provision 1 to N nodes when you launch Solr cluster
• use existing named ensemble
• Fabric command simply creates the myid files and zoo.cfg file for the ensemble
• and some cron scripts for managing snapshots
• Basic health checking of ZooKeeper status:
echo srvr | nc localhost 2181
fab new_zk_ensemble:zk1,n=3
Deploy SolrCloud cluster
• Uses bin/solr in Solr 4.10 to control Solr nodes
• Set system props: jetty.port, host, zkHost, JVM opts
• One or more Solr nodes per machine
• JVM mem opts dependent on instance type and # of Solr nodes per instance
• Optionally configure log4j.properties to append messages to RabbitMQ for SiLK integration
fab new_solrcloud:test1,zk=zk1,nodesPerHost=2
Automate day-to-day cluster management tasks
• Deploy a configuration directory to ZooKeeper
• Create a new collection
• Attach a local JConsole/VisualVM to a remote JVM
• Rolling restart (with Overseer awareness)
• Build Solr locally and patch remote
• Use a relay server to scp the JARs into the Amazon network once, then scp them to the other nodes from within the network
• Put/get files
• Grep over all log files (across the cluster)
Wrap-up and Q & A
• LucidWorks: http://www.lucidworks.com -- We’re hiring!
• Solr Scale Toolkit: https://github.com/LucidWorks/solr-scale-tk
• SiLK: http://www.lucidworks.com/lucidworks-silk/
• Solr In Action: http://www.manning.com/grainger/
• Connect: @thelabdude / [email protected]