Administering and Monitoring SolrCloud Clusters

Administering and Monitoring SolrCloud Rafał Kuć – Sematext Group, Inc. @kucrafal @sematext sematext.com


37 slides about taking care of your SolrCloud cluster: Collections API, Core API, dynamic schema modification, segment merging, hard vs. soft commit, caches, monitoring, performance, and JMX. It's all in here.

Transcript of Administering and Monitoring SolrCloud Clusters

Page 1: Administering and Monitoring SolrCloud Clusters

Administering and Monitoring SolrCloud

Rafał Kuć – Sematext Group, Inc. @kucrafal @sematext sematext.com

Page 2: Administering and Monitoring SolrCloud Clusters

Ta me…

Sematext consultant & engineer

Solr.pl co-founder

Father and husband

Page 3: Administering and Monitoring SolrCloud Clusters

SolrCloud Concepts

[Diagram: an Application and four Solr Server nodes hosting Shard1, Shard2, Shard1 Replica, and Shard2 Replica]

Page 4: Administering and Monitoring SolrCloud Clusters

Local SolrCloud Cluster

java -Dbootstrap_confdir=./solr/revolution/conf -Dcollection.configName=revolution -DzkRun -DnumShards=1 -jar start.jar

Runs embedded ZooKeeper

Bootstraps collection with 1 shard

Starts Solr
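The startup flags above can also be assembled programmatically, for example from a provisioning script. The following is a minimal Python sketch, not part of the original slides: the function name `local_solrcloud_command` and its parameters are hypothetical, and it only uses the flags shown in the command above.

```python
def local_solrcloud_command(conf_dir, config_name, num_shards=1):
    """Return the argument list for a single-node SolrCloud start.

    -DzkRun starts the embedded ZooKeeper, -Dbootstrap_confdir and
    -Dcollection.configName bootstrap the configuration, and
    -DnumShards sets the shard count.
    """
    return [
        "java",
        f"-Dbootstrap_confdir={conf_dir}",
        f"-Dcollection.configName={config_name}",
        "-DzkRun",
        f"-DnumShards={num_shards}",
        "-jar",
        "start.jar",
    ]


command = local_solrcloud_command("./solr/revolution/conf", "revolution")
print(" ".join(command))
```

Returning a list rather than a single string makes it straightforward to hand the arguments to `subprocess.run` without shell quoting issues.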

Page 5: Administering and Monitoring SolrCloud Clusters

Starting Solr Cluster

[Diagram: three ZooKeeper nodes and four Solr Server nodes; every Solr Server starts with no collection and with one of the following settings]

-DzkHost=192.168.1.2:2181,192.168.1.1:2181,192.168.1.3:2181

-DzkHost=192.168.1.1:2181,192.168.1.2:2181,192.168.1.3:2181

-DzkHost=192.168.1.3:2181,192.168.1.1:2181,192.168.1.2:2181

-DzkHost=192.168.1.3:2181,192.168.1.1:2181,192.168.1.2:2181
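Every Solr node is given the complete list of ZooKeeper addresses through `-DzkHost`. A small Python sketch of building that option from a host list follows; the helper name `zk_host_option` is hypothetical, and the default port 2181 is taken from the addresses on this slide.

```python
def zk_host_option(hosts, port=2181):
    """Build the -DzkHost JVM option from a list of ZooKeeper hosts."""
    if not hosts:
        raise ValueError("at least one ZooKeeper host is required")
    addresses = ",".join(f"{host}:{port}" for host in hosts)
    return f"-DzkHost={addresses}"


print(zk_host_option(["192.168.1.1", "192.168.1.2", "192.168.1.3"]))
```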

Page 6: Administering and Monitoring SolrCloud Clusters

Uploading Collection Configuration

./zkcli.sh -cmd upconfig -zkhost 192.168.1.1:2181 -confdir ./conf/ -confname revolution

[Diagram: collection configuration, three ZooKeeper nodes, and Solr]

Page 7: Administering and Monitoring SolrCloud Clusters

Collections API

Create

Delete

Reload

Split

Create Alias

Delete Alias

Shard Creation/Deletion

http://wiki.apache.org/solr/SolrCloud

Page 8: Administering and Monitoring SolrCloud Clusters

Collection Creation

curl 'http://solrhost:8983/solr/admin/collections?action=CREATE&name=revolution&numShards=3&replicationFactor=4'

name

numShards

replicationFactor

maxShardsPerNode

createNodeSet

collection.configName
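The CREATE call above can be generated from the listed parameters. The Python sketch below is an illustration only: the helper names are hypothetical, and the capacity estimate assumes that `replicationFactor` counts all copies of each shard and that each node hosts at most `maxShardsPerNode` cores of the collection.

```python
import math
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor, extra=None):
    """Build a Collections API CREATE URL.

    `extra` may carry optional parameters such as maxShardsPerNode,
    createNodeSet, or collection.configName.
    """
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    params.update(extra or {})
    return f"{base_url}/admin/collections?{urlencode(params)}"


def min_nodes_needed(num_shards, replication_factor, max_shards_per_node=1):
    """Estimate how many nodes are needed to place every core (assumption above)."""
    total_cores = num_shards * replication_factor
    return math.ceil(total_cores / max_shards_per_node)


print(create_collection_url("http://solrhost:8983/solr", "revolution", 3, 4))
print(min_nodes_needed(3, 4))
```

Under these assumptions the slide's example (3 shards, replication factor 4) results in 12 cores, so it needs either 12 nodes or a higher `maxShardsPerNode`.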

Page 9: Administering and Monitoring SolrCloud Clusters

Collection Split Example

$ curl 'http://solr1:8983/solr/admin/collections?action=CREATE&name=collection1&numShards=2&replicationFactor=1'

Page 10: Administering and Monitoring SolrCloud Clusters

Collection Split Example

$ curl 'http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=collection1&shard=shard1'
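Conceptually, splitting a shard divides its hash range into two halves, each served by a new sub-shard. The Python sketch below is a toy model of that idea, not Solr's implementation. It assumes that in a two-shard collection `shard1` covers the signed 32-bit range shown as `80000000-ffffffff`; the function names are hypothetical.

```python
def split_hash_range(low, high):
    """Split an inclusive signed 32-bit hash range into two halves."""
    if low >= high:
        raise ValueError("range must contain at least two values")
    middle = low + (high - low) // 2
    return (low, middle), (middle + 1, high)


def as_hex(hash_range):
    """Render a signed range as unsigned 32-bit hex, e.g. 80000000-ffffffff."""
    low, high = hash_range
    return f"{low & 0xFFFFFFFF:x}-{high & 0xFFFFFFFF:x}"


shard1 = (-2**31, -1)  # assumed range of shard1 in a two-shard collection
first_half, second_half = split_hash_range(*shard1)
print(as_hex(shard1), "->", as_hex(first_half), "and", as_hex(second_half))
```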

Page 11: Administering and Monitoring SolrCloud Clusters

Getting Deeper – CoreAdmin API

curl 'http://solrhost:8983/solr/admin/cores?action=CREATE&name=newcore&collection=revolution&shard=shard2'

collection

shard

numShards

collection.configName

Page 12: Administering and Monitoring SolrCloud Clusters

Schema – the API

Reading (Solr 4.2)

Fields

Dynamic fields

Types

Copy fields

Name (4.3)

Version (4.3)

Unique Key (4.3)

Similarity (4.3)

Writing (Solr 4.4)

Adding new fields

Adding copy fields

Page 13: Administering and Monitoring SolrCloud Clusters

Reading Your Schema

curl -XGET 'http://solrhost:8983/solr/rev/schema/fields/name'

Full reference: http://wiki.apache.org/solr/SchemaRESTAPI

{ "responseHeader" : { "status" : 0, "QTime" : 5 }, "field" : { "name" : "name", "type" : "text_general", "indexed" : true, "stored" : true }}
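A response like the one above is plain JSON and can be inspected with a few lines of Python. This sketch parses the sample response from the slide; the function name `describe_field` is hypothetical, and treating a non-zero `status` as an error is an assumption.

```python
import json

# Sample response copied from the slide above.
SAMPLE_RESPONSE = (
    '{ "responseHeader" : { "status" : 0, "QTime" : 5 }, '
    '"field" : { "name" : "name", "type" : "text_general", '
    '"indexed" : true, "stored" : true }}'
)


def describe_field(response_text):
    """Summarize a single-field schema response as one line of text."""
    body = json.loads(response_text)
    status = body["responseHeader"]["status"]
    if status != 0:  # assumption: non-zero status means failure
        raise RuntimeError(f"schema request failed with status {status}")
    field = body["field"]
    flags = [flag for flag in ("indexed", "stored") if field.get(flag)]
    return f'{field["name"]} ({field["type"]}): {", ".join(flags)}'


print(describe_field(SAMPLE_RESPONSE))
```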

Page 14: Administering and Monitoring SolrCloud Clusters

Dynamic Schema Modifications

<schemaFactory class="ManagedIndexSchemaFactory">
  <bool name="mutable">true</bool>
  <str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>

curl -XPUT 'http://solrhost:8983/solr/rev/schema/fields/content' -d '{ "type" : "text", "stored" : "false", "copyFields" : ["catchAll"]}'

curl -XPOST 'http://solrhost:8983/solr/rev/schema/copyFields' -d '[ { "source" : "name", "dest" : [ "text", "personal" ] }]'
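The request bodies above are easier to get right when generated rather than typed by hand. The Python sketch below builds the URL and JSON body for the two calls; the function names are hypothetical, and it sends `stored` as a JSON boolean rather than the string used on the slide, which is an assumption about what the API accepts.

```python
import json


def add_field_request(base_url, collection, field_name, field_type,
                      stored=True, copy_fields=()):
    """Return (url, body) for a PUT that adds one field to a managed schema."""
    url = f"{base_url}/{collection}/schema/fields/{field_name}"
    body = {"type": field_type, "stored": stored}
    if copy_fields:
        body["copyFields"] = list(copy_fields)
    return url, json.dumps(body)


def add_copy_fields_request(base_url, collection, source, destinations):
    """Return (url, body) for a POST that adds copy field rules."""
    url = f"{base_url}/{collection}/schema/copyFields"
    body = [{"source": source, "dest": list(destinations)}]
    return url, json.dumps(body)


url, body = add_field_request("http://solrhost:8983/solr", "rev", "content",
                              "text", stored=False, copy_fields=["catchAll"])
print(url)
print(body)
```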

Page 15: Administering and Monitoring SolrCloud Clusters

The Right Directory

_0.fdt _0.fdx _0.fnm _0.nvd

_1.fdt _1.fdx _1.fnm _1.nvd

StandardDirectory

SimpleFSDirectory

NIOFSDirectory

MMapDirectory

NRTCachingDirectory

RAMDirectory

<directoryFactory name="DirectoryFactory" class="solr.NRTCachingDirectoryFactory" />

Page 16: Administering and Monitoring SolrCloud Clusters

Segment Merging

[Diagram: segments a to g grouped into Level 0 and Level 1]

Page 17: Administering and Monitoring SolrCloud Clusters

Segment Merge Under Control

Merge policy

Merge scheduler

Merge factor

Merge policy configuration

https://cwiki.apache.org/confluence/display/solr/IndexConfig+in+SolrConfig
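To get a feel for what the merge factor controls, the following Python sketch simulates a simplified, level-based merge scheme: every flush creates a level 0 segment, and once `merge_factor` segments exist on a level they are merged into one segment on the next level. This is a toy model under those assumptions, not the behavior of any specific Lucene merge policy, and the function name is hypothetical.

```python
def simulate_flushes(num_flushes, merge_factor=10):
    """Return the number of segments per level after `num_flushes` flushes."""
    if merge_factor < 2:
        raise ValueError("merge_factor must be at least 2")
    levels = [0]  # levels[i] is the segment count on level i
    for _ in range(num_flushes):
        levels[0] += 1  # each flush writes a new level 0 segment
        level = 0
        # Cascade merges upward while a level is full.
        while levels[level] == merge_factor:
            levels[level] = 0
            if level + 1 == len(levels):
                levels.append(0)
            levels[level + 1] += 1
            level += 1
    return levels


for factor in (2, 10):
    counts = simulate_flushes(25, merge_factor=factor)
    print(f"merge factor {factor}: {sum(counts)} segments, per level {counts}")
```

In this model a lower merge factor leaves fewer segments to search but merges more often, which is the trade-off the merge settings let you tune.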

Page 18: Administering and Monitoring SolrCloud Clusters

Autocommit or Not?

Automatic data flush (hard commit):

<autoCommit>
  <maxTime>15000</maxTime>
  <maxDocs>1000</maxDocs>
  <openSearcher>false</openSearcher>
</autoCommit>

Automatic index view refresh (soft commit):

<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>
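The interaction of `maxTime` and `maxDocs` can be explored with a small simulation. The Python sketch below is a toy model that assumes a hard commit fires either when `maxDocs` uncommitted documents have accumulated or `maxTime` milliseconds after the oldest uncommitted document arrived, whichever comes first. The function name is hypothetical and the model ignores everything else Solr does during a commit.

```python
def hard_commit_times(update_times_ms, max_time_ms=15000, max_docs=1000):
    """Return the times (ms) at which the toy model triggers a hard commit."""
    commits = []
    pending = 0            # number of uncommitted documents
    oldest_pending = None  # arrival time of the oldest uncommitted document
    for arrival in sorted(update_times_ms):
        # Time-based trigger: the timer started by the oldest pending doc expired.
        if pending and arrival >= oldest_pending + max_time_ms:
            commits.append(oldest_pending + max_time_ms)
            pending = 0
        if pending == 0:
            oldest_pending = arrival
        pending += 1
        # Count-based trigger: enough documents have accumulated.
        if pending >= max_docs:
            commits.append(arrival)
            pending = 0
    if pending:
        commits.append(oldest_pending + max_time_ms)
    return commits


# Three updates close together lead to a single time-based commit.
print(hard_commit_times([0, 100, 200]))
```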

Page 19: Administering and Monitoring SolrCloud Clusters

Caches

q=lucene+revolution

fq=city:Dublin

Solr Cache

Refreshed with IndexSearcher

Configurable

Different purposes

Different implementations
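The points above can be illustrated with a tiny least-recently-used cache keyed by filter query. This Python sketch is a toy model only: the class `ToyFilterCache` is hypothetical, it simply empties itself when a new searcher is opened, and it does not model warming or any of Solr's actual cache implementations.

```python
from collections import OrderedDict


class ToyFilterCache:
    """LRU cache mapping a filter query string to its matching doc ids."""

    def __init__(self, size):
        self.size = size
        self.entries = OrderedDict()
        self.lookups = 0
        self.hits = 0

    def get(self, fq, compute):
        """Return the cached result for `fq`, computing it on a miss."""
        self.lookups += 1
        if fq in self.entries:
            self.hits += 1
            self.entries.move_to_end(fq)  # mark as most recently used
            return self.entries[fq]
        result = compute(fq)
        self.entries[fq] = result
        if len(self.entries) > self.size:
            self.entries.popitem(last=False)  # evict least recently used
        return result

    def on_new_searcher(self):
        """Entries belong to one index view, so drop them on refresh."""
        self.entries.clear()

    def hit_ratio(self):
        return self.hits / self.lookups if self.lookups else 0.0


cache = ToyFilterCache(size=2)
cache.get("city:Dublin", lambda fq: {1, 2, 3})  # miss, computed
cache.get("city:Dublin", lambda fq: {1, 2, 3})  # hit
print(cache.hit_ratio())
```

The hit ratio is the number to watch when monitoring: a cache that is emptied on every refresh and rarely hit costs memory without saving work.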

Page 20: Administering and Monitoring SolrCloud Clusters

Monitoring Importance

Page 21: Administering and Monitoring SolrCloud Clusters

What to Pay Attention to?

Page 22: Administering and Monitoring SolrCloud Clusters

Cluster State

Health

Shards and replica status

Shard placement

Failing nodes

Page 23: Administering and Monitoring SolrCloud Clusters

Indexing Related Metrics

Index throughput

Document distribution

I/O subsystem metrics

Merging

Page 24: Administering and Monitoring SolrCloud Clusters

Search-Related Metrics

Count

Latency

Distribution among nodes

Anomalies and spikes
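Latency is usually tracked as percentiles rather than averages, because a few slow queries can hide behind a healthy mean. The Python sketch below computes nearest-rank percentiles and flags spikes in a list of query times. The sample values are made up for illustration, and the spike rule (more than three times the median) is an arbitrary assumption.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def find_spikes(values, factor=3):
    """Return the values exceeding `factor` times the median (assumed rule)."""
    threshold = factor * percentile(values, 50)
    return [value for value in values if value > threshold]


# Hypothetical query latencies in milliseconds.
latencies_ms = [12, 15, 11, 14, 250, 13, 12, 16, 14, 13]
print("median:", percentile(latencies_ms, 50))
print("p90:", percentile(latencies_ms, 90))
print("spikes:", find_spikes(latencies_ms))
```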

Page 25: Administering and Monitoring SolrCloud Clusters

Monitoring Memory and GC

Heap details

Pool size

Pool utilization

Garbage collection count

Garbage collection time
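Garbage collection time is most useful when expressed as a share of wall-clock time. The Python sketch below derives that share from two readings of a cumulative GC time counter; the function name and the sample numbers are hypothetical.

```python
def gc_overhead_percent(previous_gc_ms, current_gc_ms, interval_ms):
    """Percentage of the sampling interval spent in garbage collection."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    if current_gc_ms < previous_gc_ms:
        raise ValueError("cumulative GC time must not decrease")
    return 100.0 * (current_gc_ms - previous_gc_ms) / interval_ms


# Hypothetical readings taken 10 seconds apart.
print(gc_overhead_percent(1200, 1500, 10000))
```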

Page 26: Administering and Monitoring SolrCloud Clusters

Monitoring OS-Related Metrics

CPU details

Load

I/O activity

Network usage

Page 27: Administering and Monitoring SolrCloud Clusters

Solr Administration Panel

Page 28: Administering and Monitoring SolrCloud Clusters

Solr & JMX

<jmx />

java -Dcom.sun.management.jmxremote -jar start.jar

Page 29: Administering and Monitoring SolrCloud Clusters

Solr & JMX

Page 30: Administering and Monitoring SolrCloud Clusters

SPM

Index statistics

Request # and latency

Caches and warmup

CPU

JVM Memory and OS Memory

Garbage collector

OS related statistics

Page 31: Administering and Monitoring SolrCloud Clusters

SPM Dashboard

Page 32: Administering and Monitoring SolrCloud Clusters

Other Monitoring Tools

Ganglia http://ganglia.sourceforge.net/

New Relic http://www.newrelic.com/

Opsview http://www.opsview.com

Page 33: Administering and Monitoring SolrCloud Clusters

Too much is too much

Page 34: Administering and Monitoring SolrCloud Clusters

Too hot

Page 35: Administering and Monitoring SolrCloud Clusters

Caches

Page 36: Administering and Monitoring SolrCloud Clusters

We Are Hiring!

Dig Search?

Dig Analytics?

Dig Big Data?

Dig Performance?

Dig working with and in open source?

We're hiring worldwide!

http://sematext.com/about/jobs.html