Post on 06-Jan-2017
Elasticsearch & Docker
Rafał Kuć – Sematext Group, Inc.
@kucrafal @sematext sematext.com
Running High Performance, Fault Tolerant Elasticsearch Clusters On Docker
About me…
Sematext consultant & engineer
Solr.pl co-founder
Father and husband :)
Next 30 minutes
You Are Probably Familiar With This
Development Test QA
Production environment
And The Problems That Come With It
Resources not utilized
Overprovisioned Servers
≠ ≠
The solution
Development Test QA Production
Container Technologies
What is Docker?
Lightweight
Based on Open Standards
Secure
Containers vs Virtual Machines
Traditional Virtual Machine (layers listed bottom to top):
Hardware
Host Operating System
Hypervisor
Guest OS Guest OS
Libraries Libraries
Application 1 Application 2
Container (layers listed bottom to top):
Hardware
Host Operating System
Docker Engine
Libraries Libraries
Application 1 Application 2
What is Elasticsearch?
Reasonable defaults
{ JSON } { JSON }
Distributed by design
http://www.dailypets.co.uk/2007/06/17/kittens-rest-at-half-time/
Running Official Elasticsearch Container
$ docker run -d elasticsearch    # same as: docker run -d elasticsearch:latest
$ docker run -d elasticsearch:1.7
$ docker run --name es_1 -h es_master_1 elasticsearch
Container Constraints
$ docker run -d -m 2G elasticsearch    # cap container memory at 2 GB
$ docker run -d -m 2G --memory-swappiness=0 elasticsearch    # also turn off swapping of anonymous pages
$ docker run -d --cpuset-cpus="1,3" elasticsearch    # run only on CPUs 1 and 3
$ docker run -d --cpu-period=50000 --cpu-quota=25000 elasticsearch    # at most 50% of one CPU
http://docs.docker.com/engine/reference/run/
Creating Optimized Image
Dockerfile:
FROM elasticsearch
ADD ./elasticsearch.yml /usr/share/elasticsearch/config/
$ docker build -t devops/example .
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM elasticsearch
 ---> 8112755253f1
Step 2 : ADD ./elasticsearch.yml /usr/share/elasticsearch/config/
 ---> Using cache
 ---> c9ca48a22e58
Successfully built c9ca48a22e58
Dealing With Network
$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch
$ docker run -d elasticsearch -Dnetwork.publish_host=192.168.1.1
$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch -Dnetwork.publish_host=192.168.1.1
$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch -Dnetwork.publish_host=0.0.0.0
Network - Good Practices
Separate network for Elasticsearch cluster
Common host names for containers
$ docker run -d -h es_node_1 elasticsearch
Expose 9200 & 9300 ports only for client nodes
Elasticsearch data & client nodes point to masters only
Dealing With Storage
By default in /usr/share/elasticsearch/data
By default not persisted
$ docker run -d -v /opt/elasticsearch/data:/usr/share/elasticsearch/data elasticsearch
Use data-only containers
Permissions
Data-Only Docker Volumes
Bypasses Union File System
Can be shared between containers
Data volumes persist if the container itself is deleted
$ docker create -v /mnt/es/data:/usr/share/elasticsearch/data --name esdata elasticsearch
$ docker run --volumes-from esdata elasticsearch
Highly Available Cluster
3 × Master only
6 × Data only
2 × Client only
discovery.zen.minimum_master_nodes = N/2 + 1 (N = number of master-eligible nodes, integer division)
gateway.recover_after_nodes
gateway.expected_nodes
cluster.routing.allocation.node_concurrent_recoveries
index.unassigned.node_left.delayed_timeout
index.priority
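The minimum_master_nodes formula on this slide can be worked through with a few lines of shell. This is only a sketch: the function name is made up for illustration, and N is assumed to be the number of master-eligible nodes, with the division rounding down.

```shell
# Majority quorum for N master-eligible nodes, using integer division.
# minimum_master_nodes is a hypothetical helper name, not an Elasticsearch command.
minimum_master_nodes() {
  echo $(( $1 / 2 + 1 ))
}

minimum_master_nodes 3   # prints 2
minimum_master_nodes 5   # prints 3
```

With the three master-only nodes shown in the cluster layout, the result is 2, so the cluster can lose one master and still elect a new one.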
Master Nodes & Docker
$ docker run -d elasticsearch -Dnode.master=true -Dnode.data=false -Dnode.client=false
Client Nodes & Docker
$ docker run -d elasticsearch -Dnode.master=false -Dnode.data=false -Dnode.client=true
Data Nodes & Docker
$ docker run -d elasticsearch -Dnode.master=false -Dnode.data=true -Dnode.client=false
Scaling
Elasticsearch Node Elasticsearch Node
Elasticsearch Node Elasticsearch Node
Scaling
curl -XPUT 'http://localhost:9200/devops/' -d '{ "settings" : { "index" : { "number_of_shards" : 4, "number_of_replicas" : 0 } }}'
Scaling
P P
P P
Scaling
curl -XPUT 'http://localhost:9200/devops/_settings' -d '{ "index.number_of_replicas" : 1}'
Scaling
P P
P P
R
R R
R
Scaling
curl -XPUT 'http://localhost:9200/devops/_settings' -d '{ "index.number_of_replicas" : 2}'
Scaling
P P
P P
R R
R R
R R
R R
Scaling
curl -XPUT 'http://localhost:9200/devops/_settings' -d '{ "index.number_of_replicas" : 1}'
Scaling
P P
P P
R
R R
R
Scaling
[Diagram: four primary shards (P) and four replica shards (R); one replica is marked Unassigned]
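The shard counts in the Scaling diagrams follow from the index settings: every primary gets number_of_replicas copies. A small sketch of that arithmetic, with a hypothetical helper name:

```shell
# Total shard copies for an index: each primary plus its replicas.
# total_shards is an illustrative helper, not part of any Elasticsearch tooling.
total_shards() {
  primaries=$1
  replicas=$2
  echo $(( primaries * (1 + replicas) ))
}

total_shards 4 0   # prints 4
total_shards 4 1   # prints 8
total_shards 4 2   # prints 12
```

Because Elasticsearch does not place a replica on the same node as its primary, copies that cannot find a separate node stay unassigned.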
RAM Buffer
indices.memory.index_buffer_size: 10%
indices.memory.min_index_buffer_size: 48mb
indices.memory.max_index_buffer_size (unbounded)
indices.memory.min_shard_index_buffer_size: 4mb
Above the defaults (>): higher indexing throughput
Below the defaults (<): lower indexing throughput
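As a rough illustration of how the default values interact, the sketch below estimates the node-wide indexing buffer for a given heap size. It assumes the buffer is 10% of the heap with a 48 MB floor and no upper bound, as listed on the slide; the function name is hypothetical and sizes are in whole megabytes.

```shell
# Estimated indexing buffer in MB: 10% of the heap, but never below 48 MB.
# index_buffer_mb is an illustrative helper; it ignores the per-shard minimum.
index_buffer_mb() {
  heap_mb=$1
  buffer=$(( heap_mb * 10 / 100 ))
  if [ "$buffer" -lt 48 ]; then
    buffer=48
  fi
  echo "$buffer"
}

index_buffer_mb 2048   # prints 204
index_buffer_mb 256    # prints 48
```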
Time-Based Data?
2015-11-23
TODAY
WEEK
Time-Based Data?
curl -XPOST 'http://localhost:9200/_aliases' -d '{ "actions" : [ { "add" : {"index":"2015-11-23","alias":"today"} }, { "add" : {"index":"2015-11-23","alias":"week"} } ]}'
Time-Based Data?
2015-11-23 2015-11-24
TODAY
WEEK
Time-Based Data?
2015-11-23 2015-11-24 2015-11-25
TODAY
WEEK
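When a new daily index appears, the aliases have to follow it. The sketch below prints an _aliases request body for that step. It assumes the "today" alias should move to the newest index while "week" also gains it; the function name and that rotation policy are assumptions, not something the slides spell out.

```shell
# Print an _aliases request body that moves "today" from the previous
# day's index to the new one and adds the new index to "week".
# rotate_aliases_body is a hypothetical helper name.
rotate_aliases_body() {
  old_index=$1
  new_index=$2
  cat <<EOF
{
  "actions" : [
    { "remove" : { "index" : "$old_index", "alias" : "today" } },
    { "add" : { "index" : "$new_index", "alias" : "today" } },
    { "add" : { "index" : "$new_index", "alias" : "week" } }
  ]
}
EOF
}

rotate_aliases_body 2015-11-23 2015-11-24
# To apply it to a node assumed to listen on localhost:9200:
# curl -XPOST 'http://localhost:9200/_aliases' -d "$(rotate_aliases_body 2015-11-23 2015-11-24)"
```

All actions in one _aliases request are applied together, so queries against "today" should not see a moment with no index behind the alias.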
Multiple Tiers
node.tag=hot node.tag=cold node.tag=cold
Multiple Tiers
curl -XPUT 'localhost:9200/data_2015-11-23' -d '{ "settings": { "index.routing.allocation.include.tag" : "hot" }}'
Multiple Tiers
node.tag=hot node.tag=cold node.tag=cold
data_2015-11-23
data_2015-11-23
Multiple Tiers
curl -XPUT 'localhost:9200/data_2015-11-23/_settings' -d '{ "settings": { "index.routing.allocation.exclude.tag" : "hot", "index.routing.allocation.include.tag" : "cold" }}'
Multiple Tiers
node.tag=hot node.tag=cold node.tag=cold
data_2015-11-23
data_2015-11-23
Multiple Tiers
node.tag=hot node.tag=cold node.tag=cold
data_2015-11-23
data_2015-11-23
data_2015-11-24
data_2015-11-24
Multiple Tiers
node.tag=hot node.tag=cold node.tag=cold
data_2015-11-23
data_2015-11-23
data_2015-11-25
data_2015-11-25
data_2015-11-24
data_2015-11-24
Multiple Tenants
Hot
Hot
Cold
Cold
Cold
Cold
ROUTING
Indexing Without Routing
Shard 1 Shard 2 Shard 3 Shard 4
Shard 5 Shard 6 Shard 7 Shard 8
Elasticsearch
Application
userA documents spread across all 8 shards
Indexing With Routing
Shard 1 Shard 2 Shard 3 Shard 4
Shard 5 Shard 6 Shard 7 Shard 8
Elasticsearch
Application
userA
Querying Without Routing
Shard 1 Shard 2 Shard 3 Shard 4
Shard 5 Shard 6 Shard 7 Shard 8
Elasticsearch
Application
Querying With Routing
Shard 1 Shard 2 Shard 3 Shard 4
Shard 5 Shard 6 Shard 7 Shard 8
Elasticsearch
Application
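The idea behind routing can be sketched in shell: a routing value is hashed and the result, modulo the number of shards, picks one shard. This is illustrative only. cksum stands in for Elasticsearch's internal hash function, so the shard numbers will not match a real cluster, and the function name is hypothetical.

```shell
# Map a routing value to one of N shards by hashing it.
# cksum (a CRC) is a stand-in for Elasticsearch's own hash function.
shard_for() {
  routing=$1
  shards=$2
  hash=$(printf '%s' "$routing" | cksum | cut -d ' ' -f 1)
  echo $(( hash % shards ))
}

shard_for userA 8
shard_for userA 8   # same routing value, so the same shard again
```

In a real request the routing value is passed with the routing URL parameter on both indexing and search, for example ?routing=userA, so that a query for userA only has to visit the shard that holds that user's documents.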
Routing vs No Routing
Queries without routing (200 shards, 1 replica)
#threads  Avg response time  Throughput  90% line  Median  CPU Utilization
1         3169ms             19.0/min    5214ms    2692ms  95 – 99%
Queries with routing (200 shards, 1 replica)
#threads  Avg response time  Throughput  90% line  Median  CPU Utilization
10        196ms              50.6/sec    642ms     29ms    25 – 40%
20        218ms              91.2/sec    718ms     11ms    10 – 15%
Monitoring
https://sematext.com/spm/integrations/docker-monitoring.html
https://github.com/sematext/spm-agent-docker
Short summary
http://www.soothetube.com/2013/12/29/thats-all-folks/
We Are Hiring!
Dig Search?
Dig Analytics?
Dig Big Data?
Dig Performance?
Dig Logging?
Dig working with, and in, open-source?
We're hiring worldwide!
http://sematext.com/about/jobs.html
Rafał Kuć @kucrafal rafal.kuc@sematext.com
Sematext @sematext http://sematext.com http://blog.sematext.com
Thank You!