Post on 18-Jul-2015
Agenda
Why Elasticsearch?
Fuzzy Searches
Relevant Searches
Pagination
Aggregations
Derivatives
Suggestions
Elasticsearch vs Relational DB
Elasticsearch (Node 1)
  index = books: type = travel, type = cooking
  index = blogs: type = travel, type = cooking
RDBMS
  database = books: table = travel, table = cooking
  database = blogs: table = travel, table = cooking
Scaling & Sharding
Node 1
primary shard 1
primary shard 2
primary shard 3
primary shard 4
primary shard 5
Scaling & Sharding contd..
Node 1
Node 2
Node 3
primary shard 1
primary shard 2
replica shard 3
primary shard 3
primary shard 4
replica shard 5
replica shard 2
primary shard 5
replica shard 4
replica shard 1
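How a document ends up on one of the five primary shards can be sketched as a hash of its routing value (by default the document id) modulo the number of primary shards. This is a minimal illustration only: `pick_shard` is a hypothetical helper, and `zlib.crc32` is a stand-in for the hash function Elasticsearch uses internally.

```python
import zlib

NUM_PRIMARY_SHARDS = 5  # matches the five primary shards shown above


def pick_shard(routing: str, num_primary_shards: int = NUM_PRIMARY_SHARDS) -> int:
    """Return a 1-based primary shard number for a routing value."""
    # crc32 is an assumption for illustration; Elasticsearch uses its own hash.
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards + 1


for doc_id in ["1", "2", "3", "42"]:
    print(doc_id, "-> primary shard", pick_shard(doc_id))
```

Because the shard number depends on the primary shard count, the same id always maps to the same shard only while that count stays fixed.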
How to use Elasticsearch?
Application
Database Elasticsearch
http://www.elasticsearch.org/download
Installation
CRUD Operations
Create
curl -XPUT localhost:9200/books/travel/1 -d '{ "title": "Journeys of a Lifetime - 500 Best Trips", "author": "National Geographic", "releaseDate": "2013-10-25T19:00" }'
Get
Mapping
Update
Concurrency Control
Document , version 1
Document , version 2
Document , version 2
FAIL
SUCCESS
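The version 1 / version 2 / FAIL / SUCCESS picture above is optimistic concurrency control: a write succeeds only if the version the client read is still the current one. A minimal in-memory sketch, assuming a hypothetical `VersionedStore` class (not an Elasticsearch API):

```python
class VersionConflict(Exception):
    """Raised when a write is based on a stale version."""


class VersionedStore:
    def __init__(self):
        self._docs = {}  # doc_id -> (version, body)

    def get(self, doc_id):
        return self._docs[doc_id]

    def put(self, doc_id, body, expected_version=None):
        current_version = self._docs.get(doc_id, (0, None))[0]
        # Reject the write if the caller read an older version.
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(
                f"expected version {expected_version}, current is {current_version}"
            )
        new_version = current_version + 1
        self._docs[doc_id] = (new_version, body)
        return new_version


store = VersionedStore()
store.put("1", {"title": "draft"})                  # document, version 1
seen_by_a, _ = store.get("1")                       # client A reads version 1
seen_by_b, _ = store.get("1")                       # client B reads version 1
store.put("1", {"title": "edit by A"}, seen_by_a)   # SUCCESS, now version 2
try:
    store.put("1", {"title": "edit by B"}, seen_by_b)
except VersionConflict as err:
    print("FAIL:", err)                             # B must re-read and retry
```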
Delete
Search
Query Search & Pagination
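Pagination in a search request comes down to the `from` and `size` parameters in the request body. A small sketch that builds such a body for a given page; `build_search_body` is a hypothetical helper, and the `title` field is borrowed from the Create example.

```python
import json


def build_search_body(text, page, page_size=10):
    """Build a match query for `text`, returning the given 1-based page."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return {
        "query": {"match": {"title": text}},
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,               # number of hits to return
    }


print(json.dumps(build_search_body("travel", page=3), indent=2))
```

The resulting JSON could be sent as the body of a `_search` request, for example against localhost:9200/books/travel/_search.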
Analyzers
Input: “I Love to Travel & Work.”
Analyzed Data
Character Filtering: “I Love to Travel and Work”
Tokenizer: “I”, “Love”, “to”, “Travel”, “and”, “Work”
Lowercase: “i”, “love”, “to”, “travel”, “and”, “work”
Stopwords: “i”, “love”, “travel”, “work”
Synonyms: “i”, “like”, “travel”, “work”
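The analysis chain above can be mimicked in a few lines. This is only a sketch of the idea, not how Elasticsearch implements analyzers; the stopword list and synonym map are assumptions chosen to reproduce the slide's example.

```python
import re

STOPWORDS = {"to", "and"}       # assumed stopword list for this example
SYNONYMS = {"love": "like"}     # assumed synonym mapping for this example


def char_filter(text):
    """Character filtering: rewrite '&' as 'and'."""
    return text.replace("&", "and")


def tokenize(text):
    """Tokenizer: split into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text)


def analyze(text):
    tokens = tokenize(char_filter(text))
    tokens = [t.lower() for t in tokens]                # lowercase
    tokens = [t for t in tokens if t not in STOPWORDS]  # stopwords
    return [SYNONYMS.get(t, t) for t in tokens]         # synonyms


print(analyze("I Love to Travel & Work."))  # ['i', 'like', 'travel', 'work']
```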
Inverted Index
Terms    Documents        Frequencies
i        id1              id1 -> 1
like     id1, id3         id1 -> 1, id3 -> 5
travel   id1, id2         id1 -> 3, id2 -> 2
work     id2, id3, id5    id2 -> 3, id3 -> 1, id5 -> 2
An inverted index is a data structure that maps each term to the documents that contain it.
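A minimal sketch of building such a structure from already-analyzed tokens, keeping a per-document term frequency as in the table. The two sample documents are made up for illustration, and `build_inverted_index` is a hypothetical helper.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: frequency} for every document containing it."""
    index = defaultdict(dict)
    for doc_id, tokens in docs.items():
        for token in tokens:
            index[token][doc_id] = index[token].get(doc_id, 0) + 1
    return index


# Hypothetical analyzed documents (token lists), not taken from real data.
docs = {
    "id1": ["i", "like", "travel", "travel", "travel"],
    "id2": ["travel", "travel", "work", "work", "work"],
}
index = build_inverted_index(docs)
print(sorted(index["travel"].items()))  # [('id1', 3), ('id2', 2)]
```

Looking up a term is then a single dictionary access, which is what makes term searches fast.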
Full Text Search: "author": { "type": "string", "index": "analyzed", "analyzer": "french" }
Exact String Search: "author": { "type": "string", "index": "not_analyzed" }
Not Searchable: "author": { "type": "string", "index": "no" }
Analyzers available in Elasticsearch
Types of Queries
Request with Query & Filters
Filters vs Queries
Filters          Queries
exact match      full text search
binary yes/no    relevance scoring
fast             relatively slow
cacheable        not cacheable
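The contrast can be shown with a toy search: a filter answers yes or no for each document, while a query produces a score used for ranking. The functions and sample documents below are hypothetical, and the term-count score is a deliberate simplification, not Elasticsearch's relevance formula.

```python
def term_filter(doc, field, value):
    """Filter: exact match, binary yes/no."""
    return doc.get(field) == value


def match_score(doc, field, text):
    """Query: crude relevance score, counting occurrences of the query terms."""
    tokens = doc.get(field, "").lower().split()
    return sum(tokens.count(term) for term in text.lower().split())


def search(docs, filter_field, filter_value, query_field, query_text):
    # Apply the cheap yes/no filter first, then score only the survivors.
    candidates = [d for d in docs if term_filter(d, filter_field, filter_value)]
    scored = [(match_score(d, query_field, query_text), d["id"]) for d in candidates]
    scored = [(score, doc_id) for score, doc_id in scored if score > 0]
    return [doc_id for score, doc_id in sorted(scored, key=lambda pair: -pair[0])]


docs = [
    {"id": 1, "type": "travel", "title": "travel the world"},
    {"id": 2, "type": "cooking", "title": "travel food"},
    {"id": 3, "type": "travel", "title": "best travel trips for travel lovers"},
]
print(search(docs, "type", "travel", "title", "travel"))  # [3, 1]
```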
Aggregations
Handling Typos with N-Grams
Uni-gram = e, l, a, s, t, i, c
Bi-gram = el, la, as, st, ti, ic
Tri-gram = ela, las, ast, sti, tic
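The n-grams above are just sliding windows over the word. A short sketch, with a hypothetical `shared_ngrams` helper showing why a misspelling such as "elastik" (an example chosen here, not from the slides) still matches most of the indexed trigrams:

```python
def ngrams(word, n):
    """Return all contiguous substrings of length n, in order."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def shared_ngrams(a, b, n):
    """Return the n-grams two words have in common."""
    return set(ngrams(a, n)) & set(ngrams(b, n))


print(ngrams("elastic", 2))  # ['el', 'la', 'as', 'st', 'ti', 'ic']
print(ngrams("elastic", 3))  # ['ela', 'las', 'ast', 'sti', 'tic']
print(sorted(shared_ngrams("elastic", "elastik", 3)))  # ['ast', 'ela', 'las', 'sti']
```

Four of the five trigrams still match, so the misspelled term can still find the document.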
Autocomplete with Edge Grams
Search Results
travel
training
transfer
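Edge n-grams are n-grams anchored at the start of the word, so indexing them turns prefix lookup into an exact term match. A minimal sketch using the three results above; `edge_ngrams` and `build_prefix_index` are hypothetical helpers, not Elasticsearch APIs.

```python
def edge_ngrams(word, min_len=1, max_len=None):
    """Return the prefixes of word with lengths min_len..max_len."""
    max_len = len(word) if max_len is None else min(max_len, len(word))
    return [word[:n] for n in range(min_len, max_len + 1)]


def build_prefix_index(words):
    """Index every edge n-gram so a typed prefix is a direct lookup."""
    index = {}
    for word in words:
        for gram in edge_ngrams(word):
            index.setdefault(gram, []).append(word)
    return index


index = build_prefix_index(["travel", "training", "transfer"])
print(edge_ngrams("travel"))  # ['t', 'tr', 'tra', 'trav', 'trave', 'travel']
print(index["tra"])           # ['travel', 'training', 'transfer']
print(index["trav"])          # ['travel']
```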
Battle Tested Lessons
Battle Tested Lessons contd..
Let’s Connect