From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity NYC 2015)
From zero to production hero:
Log analysis with Elasticsearch
Rafał Kuć
Radu Gheorghe
Who are we?
Radu
Rafał
Our Company → Sematext
HQ: NYC + Globally Distributed Team
Search & Big Data Consulting
Production Support for Solr & Elasticsearch
Training for Solr & Elasticsearch (online and onsite)
Training in NYC next week! Oct 19 & 20
Agenda
Kibana
Elasticsearch essentials, tuning and scaling
Logstash
rsyslog
Logstash + rsyslog
Commands & Configs: https://github.com/sematext/velocity
Lucene Essentials
document: {"verb": "GET"}
stored: 1) GET
indexed:
GET 1,3,5
PUT 2,4
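The stored/indexed split on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not how Lucene lays data out on disk; the five sample documents and the `build_inverted_index` helper are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical sample data: five documents, each with a "verb" field.
# Keeping the original value around is the "stored" part.
docs = {
    1: {"verb": "GET"},
    2: {"verb": "PUT"},
    3: {"verb": "GET"},
    4: {"verb": "PUT"},
    5: {"verb": "GET"},
}

def build_inverted_index(documents, field):
    """Map each term of `field` to the sorted list of document IDs that contain it."""
    index = defaultdict(list)
    for doc_id in sorted(documents):
        index[documents[doc_id][field]].append(doc_id)
    return dict(index)

# The "indexed" part: term -> document IDs, which is what makes search fast.
inverted = build_inverted_index(docs, "verb")
print(inverted)  # {'GET': [1, 3, 5], 'PUT': [2, 4]}
```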
Analysis
(Macintosh; Intel Mac OSX; en)
standard tokenizer
["Macintosh", "Intel", "Mac", "OSX", "en"]
lowercase token filter
["macintosh", "intel", "mac", "osx", "en"]
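The two analysis steps can be approximated in Python as follows. The regular expression is a rough stand-in for the standard tokenizer (which follows Unicode text segmentation rules and handles many more cases); the function names are hypothetical.

```python
import re

def standard_tokenize(text):
    """Rough approximation of a standard tokenizer: keep runs of word characters."""
    return re.findall(r"\w+", text)

def lowercase_filter(tokens):
    """Token filter that lowercases every token."""
    return [token.lower() for token in tokens]

def analyze(text):
    """Analysis chain: tokenizer first, then token filters."""
    return lowercase_filter(standard_tokenize(text))

tokens = analyze("(Macintosh; Intel Mac OSX; en)")
print(tokens)  # ['macintosh', 'intel', 'mac', 'osx', 'en']
```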
Field data
GET 1,2
PUT 2,3
1) GET
2) GET,PUT
3) PUT
expensive
heap
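A minimal sketch of what "uninverting" means, assuming the tiny index from this slide. The `uninvert` helper is hypothetical; it only shows why the operation touches every term and why the resulting per-document structure has to be held in memory.

```python
def uninvert(inverted_index):
    """Turn term -> doc IDs into doc ID -> terms by walking every posting."""
    field_data = {}
    for term in sorted(inverted_index):
        for doc_id in inverted_index[term]:
            field_data.setdefault(doc_id, []).append(term)
    return field_data

# Inverted index from the slide: term -> document IDs.
inverted = {"GET": [1, 2], "PUT": [2, 3]}

# Field data: document ID -> terms, the shape needed for sorting and aggregations.
field_data = uninvert(inverted)
print(field_data)  # {1: ['GET'], 2: ['GET', 'PUT'], 3: ['PUT']}
```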
DocValues
GET 1,2
PUT 2,3
1) GET
2) GET,PUT
3) PUT
no uninverting!
at index time; on disk
OS caches instead of heap
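As a rough illustration, doc values could be switched on per field in the index mapping. The sketch below assumes Elasticsearch 1.x-style mapping syntax (a `not_analyzed` string field with `doc_values` set); newer versions enable doc values by default and use different field types. The type name `logs` and the field name `verb` are hypothetical.

```python
import json

# Assumption: Elasticsearch 1.x-style mapping. "logs" and "verb" are
# hypothetical names chosen to match the examples on the earlier slides.
mapping = {
    "mappings": {
        "logs": {
            "properties": {
                "verb": {
                    "type": "string",
                    "index": "not_analyzed",  # index the value as a single term
                    "doc_values": True,       # build the columnar structure at index time, on disk
                }
            }
        }
    }
}

# This JSON body would be sent when creating the index.
body = json.dumps(mapping, indent=2)
print(body)
```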
Logstash
input: /var/log/apache.log
filter: grok (-w $numberOfWorkers)
GET /index.html → {"verb": "GET", "path": "/index.html"}
output: Elasticsearch (workers => 2)
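What the grok filter does to a log line can be sketched with a plain regular expression using named groups. The pattern below is a hypothetical simplification; real grok configurations are built from named patterns such as `%{WORD:verb}`.

```python
import re

# Hypothetical pattern: an upper-case verb followed by a path without spaces.
REQUEST = re.compile(r"(?P<verb>[A-Z]+) (?P<path>\S+)")

def parse_request(line):
    """Return the named fields found in `line`, or None if the line does not match."""
    match = REQUEST.search(line)
    return match.groupdict() if match else None

event = parse_request("GET /index.html")
print(event)  # {'verb': 'GET', 'path': '/index.html'}
```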
rsyslog
imfile input module: /var/log/apache.log
main queue (RAM+Disk): queue.type, queue.size, ...
queue.workerThreads, queue.dequeueBatchSize
mmnormalize: GET /index.html → {"verb": "GET", "path": "/index.html"}
omelasticsearch → Elasticsearch
mmnormalize parse tree
sys
  tem
  log
    d
    -ng
=> scales very well with # of rules (performance depends more on log length)
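The scaling claim follows from the tree shape: matching walks the message one character at a time, so the work grows with message length rather than with the number of rules. Here is a simplified sketch using a plain prefix tree of literal strings; real mmnormalize rulebases also contain field parsers, which this example leaves out, and the helper names are hypothetical.

```python
def build_tree(literals):
    """Build a prefix tree (nested dicts) from a list of literal strings."""
    root = {}
    for literal in literals:
        node = root
        for ch in literal:
            node = node.setdefault(ch, {})
        node["$end"] = literal  # multi-character key marks a complete rule
    return root

def match(tree, text):
    """Walk the tree one character at a time; cost depends on len(text), not on rule count."""
    node = tree
    for ch in text:
        if ch not in node:
            return None
        node = node[ch]
    return node.get("$end")

tree = build_tree(["system", "syslogd", "syslog-ng"])
print(match(tree, "syslog-ng"))  # syslog-ng
print(match(tree, "sysfoo"))     # None
```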
rsyslog + Logstash via Kafka
rsyslog → Apache Kafka → Logstash → Elasticsearch
rsyslog: file input, mmnormalize, omkafka + JSON template
Logstash: Kafka input + JSON codec, Elasticsearch output
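The hand-off between the two tools relies on both sides agreeing on JSON: the shipper serializes the parsed event, the consumer decodes it. A minimal sketch of that contract, using an in-memory queue as a stand-in for a Kafka topic (no Kafka client is involved; `ship` and `consume` are hypothetical names):

```python
import json
import queue

# In-memory stand-in for a Kafka topic; an assumption made for illustration only.
topic = queue.Queue()

def ship(event):
    """Shipper side: serialize the parsed event to a JSON string and publish it."""
    topic.put(json.dumps(event))

def consume():
    """Consumer side: read one message and decode the JSON back into fields."""
    return json.loads(topic.get())

ship({"verb": "GET", "path": "/index.html"})
received = consume()
print(received)  # {'verb': 'GET', 'path': '/index.html'}
```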
Free eBooks @ sematext.com
We are hiring too: http://sematext.com/about/jobs.html