Elasticsearch: Search Engine on your server
Aravind Putrevu
Developer | Evangelist
@aravindputrevu | aravindputrevu.in
elastic.co/community
Agenda
1. Terms
2. Talking to Elasticsearch
3. Mappings
4. Analyzers and Aggregations
5. Capacity Planning
Elastic Stack
No enterprise edition
All new versions with 6.2
X-Pack
Security
Alerting
Monitoring
Reporting
Machine Learning
Graph
Why is it popular?
● Speed
● Scale
● Relevance
Terms
https://www.elastic.co/guide/en/elasticsearch/reference/current/glossary.html
● Cluster: a collection of one or more nodes (servers).
● Node: a single server that is part of your cluster, stores your data, and participates in the cluster’s indexing and search capabilities.
● Index: a collection of documents that have somewhat similar characteristics.
● Type (deprecated in 6.0.0): formerly a logical category/partition of your index that let you store different types of documents in the same index.
● Document: a basic unit of information that can be indexed. It is expressed in JSON (JavaScript Object Notation), a ubiquitous internet data interchange format.
● Shard: Elasticsearch provides the ability to subdivide your index into multiple pieces called shards.
● Replica: Elasticsearch allows you to make one or more copies of your index’s shards, called replica shards, or replicas for short.
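To make shards and replicas concrete, here is a minimal sketch of the settings body that could accompany an index-creation request (`PUT /<index-name>`). The helper names and the values 3 and 1 are illustrative assumptions, not part of the talk.

```python
import json


def index_settings(primary_shards: int, replicas: int) -> dict:
    """Build the body for an index-creation request (PUT /<index-name>)."""
    return {
        "settings": {
            "number_of_shards": primary_shards,  # chosen when the index is created
            "number_of_replicas": replicas,      # copies kept of each primary shard
        }
    }


def total_shard_copies(primary_shards: int, replicas: int) -> int:
    """Each primary shard exists once, plus `replicas` additional copies."""
    return primary_shards * (1 + replicas)


body = index_settings(primary_shards=3, replicas=1)
print(json.dumps(body, indent=2))
print(total_shard_copies(3, 1))  # 6
```

With 3 primary shards and 1 replica, the cluster holds 6 shard copies in total, which is the number to keep in mind when sizing data nodes.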
All product names, logos, and brands are property of their respective owners and are used only for identification purposes. This is not an endorsement.
Elasticsearch Node Types
[Diagram: Elasticsearch with X-Pack. Node roles: Master (3), Ingest (X), Machine Learning (2+), Data – Hot (X), Data – Warm (X), Coordinating (X)]
• Master Nodes
  – Control the cluster; a minimum of 3 is recommended, and one is active at any given time
• Data Nodes
  – Hold indexed data and perform data-related operations
  – Differentiated Hot and Warm Data nodes can be used
• Coordinating Nodes
  – Route requests, handle the search reduce phase, distribute bulk indexing
  – All nodes function as coordinating nodes
• Ingest Nodes
  – Use ingest pipelines to transform and enrich documents before indexing
• Machine Learning Nodes
  – Run machine learning jobs
Nodes can play one or more roles, for workload isolation and scaling
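As a rough sketch of how roles are assigned, the helper below renders the `node.master`, `node.data`, and `node.ingest` settings used in 6.x-era `elasticsearch.yml` files. The function names are hypothetical, and newer releases configure roles differently, so treat this as an illustration only.

```python
KNOWN_ROLES = ("data", "ingest", "master")


def role_settings(roles: set) -> dict:
    """Map a set of role names to boolean node.* settings (6.x style)."""
    unknown = set(roles) - set(KNOWN_ROLES)
    if unknown:
        raise ValueError(f"unknown roles: {sorted(unknown)}")
    return {f"node.{role}": role in roles for role in KNOWN_ROLES}


def to_yaml_lines(settings: dict) -> list:
    """Render the settings as lines for elasticsearch.yml."""
    return [f"{key}: {str(value).lower()}" for key, value in settings.items()]


# A dedicated master node: master role only.
print("\n".join(to_yaml_lines(role_settings({"master"}))))

# A coordinating-only node: every role switched off.
print("\n".join(to_yaml_lines(role_settings(set()))))
```

A node with every role disabled still routes requests, which is why it is called coordinating-only.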
What powers Elasticsearch?
Apache Lucene
https://www.elastic.co/blog/found-elasticsearch-top-down
● A Java library
● Great for full-text search
But
● Challenging to use
● Not designed for scale
Talking to Elasticsearch
https://www.elastic.co/guide/en/elasticsearch/client/index.html
Indexing a document
https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster
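Indexing a single document is one HTTP request with a JSON body. The sketch below only builds that request with the Python standard library and does not send it. The host `localhost:9200`, the index name `talks`, and the sample document are assumptions; the `_doc` path segment may need to be an explicit type name on older 6.x clusters.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local, unsecured node


def build_index_request(index: str, doc_id: str, document: dict) -> urllib.request.Request:
    """Build a PUT /<index>/_doc/<id> request carrying the document as JSON."""
    return urllib.request.Request(
        url=f"{ES_URL}/{index}/_doc/{doc_id}",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_index_request("talks", "1", {"title": "Search Engine on your server"})
print(request.get_method(), request.full_url)
# To send it against a running node: urllib.request.urlopen(request)
```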
Inserting data: _bulk
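The `_bulk` endpoint takes newline-delimited JSON: an action line followed by the document source, repeated, with a trailing newline. Below is a minimal sketch of building such a body; the index name and documents are made up, and on 6.x clusters the request may also need a type, either in the URL or in each action line.

```python
import json


def build_bulk_body(index: str, documents: dict) -> str:
    """Build an NDJSON body of index actions for POST /_bulk."""
    lines = []
    for doc_id, source in documents.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))  # action line
        lines.append(json.dumps(source))                                        # document line
    return "\n".join(lines) + "\n"  # the body must end with a newline


bulk_body = build_bulk_body("talks", {"1": {"title": "Terms"}, "2": {"title": "Mappings"}})
print(bulk_body, end="")
```

The body would be sent with the `Content-Type: application/x-ndjson` header.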
Where will my data go?
The default value used for _routing is the document’s _id.
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-routing-field.html
shard = hash(_routing) % number_of_primary_shards
0 ≤ shard ≤ number_of_primary_shards - 1
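The routing rule can be sketched as a hash of the routing value taken modulo the number of primary shards. Elasticsearch uses its own Murmur3-based hash; the sketch below swaps in `zlib.crc32` purely as a stand-in, so the shard numbers it prints will not match a real cluster.

```python
import zlib


def pick_shard(routing: str, number_of_primary_shards: int) -> int:
    """Return a shard number in the range 0 .. number_of_primary_shards - 1."""
    hashed = zlib.crc32(routing.encode("utf-8"))  # stand-in hash, NOT Elasticsearch's Murmur3
    return hashed % number_of_primary_shards


for doc_id in ("1", "2", "user-42"):
    # _routing defaults to the document's _id
    print(doc_id, "->", pick_shard(doc_id, number_of_primary_shards=5))
```

Because the result depends on the number of primary shards, changing that number would send existing documents to different shards, which is why it is chosen when the index is created.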
Mappings
Full Text Analysis
Inverted Index
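An inverted index maps each term to the documents that contain it. The toy sketch below uses lowercasing and whitespace splitting in place of a real analyzer; the two sample documents are made up.

```python
from collections import defaultdict


def build_inverted_index(documents: dict) -> dict:
    """Map each lowercased term to the sorted list of document ids containing it."""
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(doc_ids) for term, doc_ids in postings.items()}


docs = {1: "Elasticsearch is fast", 2: "Lucene is a Java library"}
inverted = build_inverted_index(docs)
print(inverted["is"])      # [1, 2]
print(inverted["lucene"])  # [2]
```

A search for a term becomes a dictionary lookup rather than a scan over every document.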
Analyzer
Helps convert text into tokens for better search capability
1. Character filters
2. Tokenizer
3. Token filters
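The three analyzer stages can be imitated in a few lines of Python. This is a toy pipeline, not Elasticsearch's implementation: the HTML-stripping character filter, whitespace tokenizer, and the small stopword list are all illustrative assumptions.

```python
import re

STOPWORDS = {"a", "the", "is"}  # assumption: a tiny illustrative stopword list


def strip_html(text: str) -> str:
    """Character filter: replace HTML tags with spaces before tokenizing."""
    return re.sub(r"<[^>]+>", " ", text)


def whitespace_tokenizer(text: str) -> list:
    """Tokenizer: split the text on runs of whitespace."""
    return text.split()


def lowercase_filter(tokens: list) -> list:
    """Token filter: lowercase every token."""
    return [token.lower() for token in tokens]


def stop_filter(tokens: list) -> list:
    """Token filter: drop stopwords."""
    return [token for token in tokens if token not in STOPWORDS]


def analyze(text: str) -> list:
    tokens = whitespace_tokenizer(strip_html(text))  # stages 1 and 2
    for token_filter in (lowercase_filter, stop_filter):  # stage 3, applied in order
        tokens = token_filter(tokens)
    return tokens


print(analyze("<b>The</b> Search Engine is FAST"))  # ['search', 'engine', 'fast']
```

The order of token filters matters: lowercasing runs first so that "The" matches the lowercase stopword list.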
Aggregations
● Metrics
● Bucket
● Pipeline
● and so on...
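One request body can combine all three families. The sketch below builds a search body with a `terms` bucket aggregation, an `avg` metric inside each bucket, and a `max_bucket` pipeline aggregation over the result. The field names `host.keyword` and `bytes` and the aggregation names are hypothetical.

```python
import json


def host_traffic_aggregation() -> dict:
    """Build a search body that averages `bytes` per host and finds the top bucket."""
    return {
        "size": 0,  # return aggregation results only, no hits
        "aggs": {
            "by_host": {
                "terms": {"field": "host.keyword"},                  # bucket
                "aggs": {"avg_bytes": {"avg": {"field": "bytes"}}},  # metric
            },
            "busiest_host": {
                "max_bucket": {"buckets_path": "by_host>avg_bytes"}  # pipeline
            },
        },
    }


aggregation_body = host_traffic_aggregation()
print(json.dumps(aggregation_body, indent=2))
```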
Querying Data
● Full Text Queries
● Term Level Queries
● Compound Queries
● Geo Queries
Query DSL
Match Query
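A match query analyzes the search text before looking it up, so it suits full-text fields. A minimal sketch of the request body follows; the field `title` and the search text are assumptions.

```python
import json


def match_query(field: str, text: str) -> dict:
    """Build a full-text match query body for the _search endpoint."""
    return {"query": {"match": {field: text}}}


match_body = match_query("title", "search engine")
print(json.dumps(match_body))
```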
Query DSL
Term Queries
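Term-level queries skip analysis and look for the exact value, so they are usually pointed at keyword, numeric, or date fields. The sketch below wraps a `term` query in a `bool` filter, which is one common way to use it; the field `status` and value `published` are hypothetical.

```python
import json


def term_filter(field: str, value: str) -> dict:
    """Build a bool query that filters on an exact, unanalyzed value."""
    return {"query": {"bool": {"filter": [{"term": {field: value}}]}}}


term_body = term_filter("status", "published")
print(json.dumps(term_body))
```

Filter clauses do not contribute to the relevance score, which fits exact-match conditions well.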
Query DSL
Nested queries
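A nested query searches inside an array of objects so that conditions must hold within the same object. The sketch assumes a hypothetical `comments` field mapped with type `nested`, holding `author` and `stars` sub-fields.

```python
import json


def nested_comment_query(author: str, min_stars: int) -> dict:
    """Find documents with a single comment by `author` rated at least `min_stars`."""
    return {
        "query": {
            "nested": {
                "path": "comments",  # the nested field to search within
                "query": {
                    "bool": {
                        "must": [
                            {"match": {"comments.author": author}},
                            {"range": {"comments.stars": {"gte": min_stars}}},
                        ]
                    }
                },
            }
        }
    }


nested_body = nested_comment_query("aravind", 4)
print(json.dumps(nested_body, indent=2))
```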
Query DSL
Geo queries
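One example of a geo query is `geo_distance`, which keeps documents within a radius of a point. The sketch assumes a hypothetical `location` field mapped as `geo_point`; the coordinates and distance are illustrative.

```python
import json


def geo_distance_query(lat: float, lon: float, distance: str) -> dict:
    """Build a filter for documents whose `location` lies within `distance` of a point."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        "location": {"lat": lat, "lon": lon},
                    }
                }
            }
        }
    }


geo_body = geo_distance_query(12.97, 77.59, "5km")
print(json.dumps(geo_body, indent=2))
```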
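[Diagram: Elastic Stack reference architecture. Data sources: log files, metrics, wire data, datastores, web APIs, social, sensors. Collection and transport: Beats (including your{beat}), Logstash nodes (X), and a messaging queue (Kafka, Redis). Elasticsearch with X-Pack: master nodes (3), ingest nodes (X), data nodes – hot (X), data nodes – warm (X). Kibana with X-Pack: instances (X). Integrations: ES-Hadoop and the Hadoop ecosystem, custom UI, authentication (LDAP, AD, SSO), notification.]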
Capacity Planning
It depends...
https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing
Capacity Planning
What is your use case?
● Full text search
● Logging/Metrics
● Complex aggregations with a lot of users
Each use case needs a different cluster configuration.
https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing
Capacity Planning
Let us take logging...
● Inflow of data
  ○ Per day: 10GB
  ○ Per month: 300GB
  ○ Per year: 3600GB
● Data retention
  ○ 15 days
● High availability (replication factor)
  ○ 1, i.e., 7200GB per year
● Type of queries
Master Node : X
Data Node : X
https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing
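The logging numbers above can be reproduced with a few lines of arithmetic. The sketch also works out what a 15-day retention window would hold at any one time, which is a derived figure and not one stated on the slide. It counts raw data only; the on-disk index size can differ from the raw size.

```python
DAYS_PER_MONTH = 30   # the slide's rounding: 10GB x 30 = 300GB
MONTHS_PER_YEAR = 12


def yearly_inflow_gb(daily_gb: float) -> float:
    """Raw data arriving over a year, using 30-day months as on the slide."""
    return daily_gb * DAYS_PER_MONTH * MONTHS_PER_YEAR


def stored_gb(raw_gb: float, replicas: int) -> float:
    """Storage needed for the primaries plus `replicas` full copies."""
    return raw_gb * (1 + replicas)


daily_gb, retention_days, replicas = 10, 15, 1
print(yearly_inflow_gb(daily_gb))                       # 3600
print(stored_gb(yearly_inflow_gb(daily_gb), replicas))  # 7200
print(stored_gb(daily_gb * retention_days, replicas))   # 300
```

With a 15-day retention policy and one replica, roughly 300GB of raw data is held at a time, far less than the 7200GB that flows through in a year.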
Capacity Planning
Hardware Recommendations
● SSDs are the best
● Local disk is king!
● Prefer medium-sized machines over large machines
● Give only 50% of your RAM to the Elasticsearch heap
● Don’t cross 32GB of Java heap space
https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing
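The two memory rules combine into a simple calculation: take half of the machine's RAM, then cap it below 32GB. The cap of 31GB used here is an assumption chosen to stay under the limit, not a number from the talk.

```python
def recommended_heap_gb(ram_gb: float, heap_cap_gb: float = 31.0) -> float:
    """Half of RAM for the heap, capped so the heap stays below 32GB."""
    return min(ram_gb * 0.5, heap_cap_gb)


for ram_gb in (16, 64, 128):
    print(ram_gb, "GB RAM ->", recommended_heap_gb(ram_gb), "GB heap")
```

On a 128GB machine most of the memory is left outside the heap, which is one reason to prefer several medium-sized machines over one large one.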
[Diagram: Elastic Stack reference architecture (repeated), with hot and warm data nodes]
https://www.elastic.co/blog/hot-warm-architecture-in-elasticsearch-5-x
training.elastic.co
Resources
• https://www.elastic.co/learn
• https://www.elastic.co/blog/category/engineering
• https://discuss.elastic.co/
• https://fb.com/groups/ElasticIndiaUserGroup
• https://elastic.co/community