Elasticsearch in production New York Meetup at Twitter October 2014

Elasticsearch in production Konrad Beiske [email protected] @beiske

description

Elasticsearch easily lets you develop amazing things, and it has gone to great lengths to make Lucene's features readily available in a distributed setting. However, when it comes to running Elasticsearch in production, you still have a fairly complicated system on your hands: a system with high demands on network stability, a huge appetite for memory, and a system that assumes all users are trustworthy. This talk will cover some of the lessons we've learned from securing and herding hundreds of Elasticsearch clusters.

Transcript of Elasticsearch in production New York Meetup at Twitter October 2014

Page 1: Elasticsearch in production New York Meetup at Twitter October 2014

Elasticsearch in production

Konrad Beiske [email protected]

@beiske

Page 2:

Who?

Senior software engineer at Found AS

Working with Elasticsearch for 2 years

Herding hundreds of Elasticsearch clusters

Page 4:

Agenda

• Anti-patterns

• Memory / Resource Usage

• Distributed problems

• Security

• Client concerns

• Changing a cluster

Page 5:

found.no/foundation

Page 10:

Snapshot / Restore

Circuit breakers

Document values

Aggregations

Distributed percolation

Suggesters


Page 12:

Anti-Patterns

Page 13:

Arbitrary Keys

• “Schema Free”

• One field per value

• Ever-growing cluster state

acls:
  1234: READ
  42: WRITE
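
The acls example above creates one mapped field per user ID, so the mapping (and with it the cluster state) grows with every new key. One common way around this is to move the keys into values, so the set of field names stays fixed. The following is a minimal sketch, assuming you control the document shape before indexing; the field names `user_id` and `permission` are illustrative and not from the talk.

```python
def acls_to_entries(acls):
    """Turn {"1234": "READ", "42": "WRITE"} into a list of fixed-shape objects.

    Only two field names ("user_id", "permission") ever reach the mapping,
    no matter how many users exist. Both names are hypothetical.
    """
    return [
        {"user_id": str(user_id), "permission": permission}
        for user_id, permission in sorted(acls.items(), key=lambda kv: str(kv[0]))
    ]


doc = {"acls": acls_to_entries({"1234": "READ", "42": "WRITE"})}
print(doc)
```

Whether the list should be mapped as a nested type depends on how the entries are queried; that choice is outside this sketch.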

Page 14:

Heavy Updating

• Update = Delete + Reindex

• Be careful with counters
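
Because every update deletes and reindexes the whole document, bumping a counter once per event multiplies that cost. One possible mitigation is to accumulate increments on the client and send a single update per document at intervals. The sketch below only shows the buffering; `CounterBuffer` is a hypothetical helper, and sending the flushed batch to Elasticsearch is left to whatever client you use.

```python
from collections import Counter


class CounterBuffer:
    """Accumulates counter increments in memory (hypothetical helper).

    Instead of one Elasticsearch update per event, flush() returns one
    pending increment per document for the caller to send.
    """

    def __init__(self):
        self._pending = Counter()

    def increment(self, doc_id, amount=1):
        self._pending[doc_id] += amount

    def flush(self):
        """Return the accumulated (doc_id, delta) pairs and reset the buffer."""
        batch = sorted(self._pending.items())
        self._pending.clear()
        return batch


buf = CounterBuffer()
for doc_id in ["a", "b", "a", "a"]:
    buf.increment(doc_id)
print(buf.flush())  # [('a', 3), ('b', 1)]
```

The trade-off is that increments still sitting in the buffer are lost if the process dies before the next flush.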

Page 15:

Slow queries

• WHERE foo ILIKE '%bar%'

• {"query_string": {"query": "foo:*bar*"}}
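
A leading wildcard like the one in foo:*bar* cannot use the sorted term dictionary efficiently, which is what makes it the search-engine cousin of ILIKE '%bar%'. The query_string query has an `allow_leading_wildcard` option; with it set to false, such a query should be rejected rather than executed. A minimal sketch that builds the request body (the function name is an assumption for illustration):

```python
import json


def user_query_string(text):
    """Build a query_string query that refuses leading wildcards."""
    return {
        "query_string": {
            "query": text,
            "allow_leading_wildcard": False,
        }
    }


print(json.dumps(user_query_string("foo:bar*")))
```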

Page 16:

Arbitrary searches

query:
  filtered:
    filter:
      term:
        user_id: 42
    query: [user's query here]
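
The slide lists this shape under anti-patterns. One reading is that the filter does limit which documents match, but passing the user's query through verbatim still lets the user run anything the query DSL allows. A sketch under that assumption: accept only plain text from the user and place it in a match query yourself. The field name `body` and the function name are hypothetical; the filtered query is the 1.x-era form shown on the slide.

```python
def search_body_for_user(user_id, user_text):
    """Wrap user input in the filtered query from the slide.

    The user's text goes into a match query on a hypothetical "body" field,
    so the user never supplies raw query DSL.
    """
    if not isinstance(user_text, str):
        raise TypeError("user_text must be plain text, not a query structure")
    return {
        "query": {
            "filtered": {
                "filter": {"term": {"user_id": user_id}},
                "query": {"match": {"body": user_text}},
            }
        }
    }


print(search_body_for_user(42, "quarterly report"))
```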

Page 18:

Time Bomb

Page 20:

Memory

• Field caches

• Filter caches

• Page caches

• Aggregations

• Index building

Page 21:

Page Cache

• Keeping index pages in memory

• Can’t have too much

• Outgrow: Gradual slowdown

Page 22:

Heap Space

• Memory used by Elasticsearch process

• Field / Filter caches

• Aggregations

Page 23:

Time Bomb

Page 25:

OutOfMemoryError

Woah there

I ate all the memories

Your cluster may or may not work any more

Page 26:

OutOfMemory

• Growing too big

• Selecting too large a time span in Kibana

• Document ingestion peak

Page 27:

Preventing OOMs

• Have enough memory :-)

• Understand your search’s memory profile

• Bulk / Circuit breaker settings

• Monitoring

• Document values
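
On the bulk point in the list above: a bulk request has to be held in memory while it is processed, so an ingestion peak sent as one huge request is a plausible way to run a node out of heap. A minimal client-side sketch that caps batch size before anything is sent; the default of 500 is an arbitrary assumption, not a recommendation from the talk, and the right size depends on document size and cluster capacity.

```python
def chunked(docs, max_docs=500):
    """Yield lists of at most max_docs documents (500 is an arbitrary default)."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) >= max_docs:
            yield batch
            batch = []
    if batch:
        yield batch


sizes = [len(b) for b in chunked(range(1200), max_docs=500)]
print(sizes)  # [500, 500, 200]
```

Each yielded batch would then go out as its own bulk request.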

Page 28:

Marvel (/_stats)

Page 31:

Document Values

Page 32:

"my_field": {
  "type": "string",
  "fielddata": { "format": "doc_values" }
}

Page 34:

Sizing

• Test, don’t guess

• Start big, scale down

• Index, search, monitor

Page 38:

Glitch Meltdown

Page 46:

• Tie-breaker can be a cheap master-node

• Applies to data centers / availability zones too
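
The tie-breaker point comes down to majority arithmetic. With two master-eligible nodes a majority is two, so losing either node leaves no quorum; adding a third, cheap master-only node makes the majority two out of three. The sketch below assumes the result is what you would feed into the 1.x setting `discovery.zen.minimum_master_nodes`, which the slides do not name explicitly. The same arithmetic applies when the units are data centers or availability zones rather than nodes.

```python
def minimum_master_nodes(master_eligible):
    """Smallest majority of the master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


for n in (2, 3):
    quorum = minimum_master_nodes(n)
    print(f"{n} master-eligible nodes: quorum {quorum}, tolerates {n - quorum} lost")
```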

Page 47:

Data-only nodes

Master-only nodes

Page 50:

Jepsen

• Kyle Kingsbury’s series on distributed systems

• Distributed systems are hard

• aphyr.com

Page 52:

Security

• “Not my job!” – Elasticsearch

• That’s fine!

Page 53:

Dynamic Scripts

• Scoring

• Aggregations

• Updating

Page 54:

Dynamic Scripts

Runtime.getRuntime().exec(…)

Page 55:

Security

• Disable dynamic scripts

• Mind index patterns

• Even then, don’t accept arbitrary requests
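
On "mind index patterns": the index part of a request path accepts wildcards, comma-separated lists and `_all`, so a user-controlled index name can reach indices you never meant to expose. A minimal sketch of an allowlist check in the application layer, assuming requests pass through your own code before reaching Elasticsearch; the index names are hypothetical.

```python
import re

ALLOWED_INDICES = {"products", "articles"}  # hypothetical index names
_PLAIN_NAME = re.compile(r"[a-z0-9][a-z0-9_-]*")


def safe_index(requested):
    """Return the index name only if it is a plain, allowlisted name."""
    if not _PLAIN_NAME.fullmatch(requested):
        raise ValueError("index name contains wildcards, commas or other specials")
    if requested not in ALLOWED_INDICES:
        raise ValueError("index is not on the allowlist")
    return requested


print(safe_index("products"))
```

This covers only the index name; the request body needs its own restrictions, as the last bullet says.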

Page 57:

Client Concerns

• Connection pools

• Idempotent requests

• Have sane syncing/indexing strategies
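
On idempotent requests: if an index request times out and the client retries, an auto-generated ID produces a duplicate document, while an explicit ID simply overwrites the same document. One way to get explicit IDs, sketched here under the assumption that a record's content identifies it, is to hash a canonical form of the record. The function name and the `orders` source label are illustrative.

```python
import hashlib
import json


def deterministic_id(source_name, record):
    """Derive a stable document ID from the record's content."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha1(f"{source_name}:{canonical}".encode("utf-8"))
    return digest.hexdigest()


a = deterministic_id("orders", {"id": 7, "total": 12.5})
b = deterministic_id("orders", {"total": 12.5, "id": 7})
print(a == b)  # True: key order does not change the ID
```

If the source system already has a primary key, using that directly is simpler than hashing.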

Page 59:

# BOOM !

Page 61:

Cluster changes

• Make new nodes join existing cluster

• No rolling restarts

• Easy rollback if things go bad

Page 62:

v1.0.0 v1.0.1

Page 67:

Cluster changes

• Test first

• Mind the recover_* settings

Page 68:

Multi-Cluster Workflows

• Snapshot/Restore

• Operations across clusters

• Swap clusters!

• Works well with good syncing strategy

Page 69:

Misc

• Same JVM

• ulimits

• Unicast and cluster name

• SSD? Use the noop scheduler

Page 70:

@foundsays

Learn More!

found.no/foundation

@beiske