Centralized logs the ElasticSearch way - LinuxDays
Centralized logs the ElasticSearch way
by Seznam.cz
Jan Šimák @ infra SCIF
several years ago
past issues
present
the journey was long
and still continues
at the beginning
filebeat logstash elasticsearch
kibana, elastic api
next step
filebeat logstash elasticsearch
kibana, elastic api, kafka
now we are here
cross-cluster search
kibana + es api
dc X dc XX dc XXX dc XXXX
loadbalancer “labrador”
cluster breakdown
role: master
role: data+ingest, node.attr.type: indexing-nodes
role: data+ingest, node.attr.type: inactive-logs
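The custom node attribute `type` is the kind of label that Elasticsearch shard allocation filtering can match on. The slides do not show how indices are moved between the two groups of data nodes, so the following is only a minimal sketch of one common approach: building the index-settings body that requires shards to live on nodes with a given `node.attr.type` value. The function name is hypothetical; the attribute values are taken from the slide.

```python
import json


def allocation_settings(node_type: str) -> dict:
    """Build an index-settings body that requires an index's shards to be
    allocated on nodes whose custom attribute `type` (node.attr.type)
    equals `node_type`."""
    return {"index.routing.allocation.require.type": node_type}


# Body for a request such as: PUT /<index>/_settings
# (assumption: an index that is no longer written to is moved this way)
print(json.dumps(allocation_settings("inactive-logs")))
```

Sending this body to the settings endpoint of an index would ask the cluster to relocate its shards to the matching nodes.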
cluster breakdown
role: master
5 instances
OpenStack vm
15GB RAM
14 vCPU
cluster breakdown
role: data+ingest
node.attr.type: indexing-nodes
16 instances
bare metal
128GB RAM
40 HT CPU
4 × 2TB SSD w/o RAID
vendor: szn montovna (in-house assembly shop)
cluster breakdown
role: data+ingest
node.attr.type: inactive-nodes
48 instances
bare metal
64 - 128GB RAM
24 - 40 HT CPU
SAS or SSD, various sizes, w/ or w/o RAID
various vendors
cluster breakdown
common jvm configuration (except heap size)
common elasticsearch.yml (except roles and attrs)
mix of versions:
5.6.x (first cluster)
6.8.x (second cluster, CCS)
no docker deployment (but we love docker)
Ansible deployment
cross-cluster search
2 instances per DC
OpenStack vm
32GB RAM
24 vCPU
dedicated ES cluster
keeps kibana’s indices only
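With cross-cluster search, a remote index is addressed as `cluster_alias:index_name`, and several such targets can be combined in one comma-separated expression. A minimal sketch of building that expression for a search spanning multiple data centers; the aliases `dc1` and `dc2` and the pattern `logs-*` are made-up placeholders, not names from the talk.

```python
def ccs_index_expression(cluster_aliases, index_pattern):
    """Join `alias:pattern` targets into one cross-cluster search expression."""
    return ",".join(f"{alias}:{index_pattern}" for alias in cluster_aliases)


# Hypothetical remote cluster aliases, one per data center.
expression = ccs_index_expression(["dc1", "dc2"], "logs-*")

# The expression takes the place of the index name in the search path,
# e.g. GET /<expression>/_search on the dedicated CCS cluster.
print(f"/{expression}/_search")
```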
security, what?
all clusters are secured by
PKI for:
mutual node communication
rest api communication
one shared CA for both client and server certificates
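From a client's point of view, PKI on the REST API means verifying the server against the CA and presenting a client certificate in return. A minimal sketch using Python's standard `ssl` module; the function name and all file paths are assumptions, and nothing here reflects the actual tooling used in the talk.

```python
import ssl


def build_client_context(ca_file=None, cert_file=None, key_file=None):
    """Create a TLS context that verifies the server certificate and,
    when a client certificate is supplied, presents it to the server
    (mutual TLS)."""
    # With ca_file=None the system trust store is used instead of a private CA.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if cert_file:
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


# Hypothetical usage with a private CA and a client key pair:
# context = build_client_context("ca.pem", "client.pem", "client.key")
```

The resulting context can be passed to an HTTPS client, for example `urllib.request.urlopen(url, context=context)`.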
how do we manage ES?
curator
periodic cron jobs
cerebro
third party web UI for a visual overview
ad-hoc r/w operational tasks
directly through the rest api (port 9200)
ad-hoc r/w operational tasks
_cat API: the Swiss Army knife
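The `_cat` endpoints can also return JSON (`?format=json`), which makes quick ad-hoc checks easy to script. A minimal sketch that counts started shards per node from `GET /_cat/shards?format=json` rows; the sample rows are invented for illustration and the function name is hypothetical.

```python
from collections import Counter


def shards_per_node(cat_shards_rows):
    """Count STARTED shards per node from `_cat/shards?format=json` rows."""
    return Counter(
        row["node"] for row in cat_shards_rows if row.get("state") == "STARTED"
    )


# Invented sample in the shape returned by GET /_cat/shards?format=json
sample = [
    {"index": "logs-000001", "shard": "0", "prirep": "p", "state": "STARTED", "node": "node-a"},
    {"index": "logs-000001", "shard": "0", "prirep": "r", "state": "STARTED", "node": "node-b"},
    {"index": "logs-000001", "shard": "1", "prirep": "p", "state": "STARTED", "node": "node-a"},
    {"index": "logs-000002", "shard": "0", "prirep": "p", "state": "UNASSIGNED", "node": None},
]

for node, count in sorted(shards_per_node(sample).items()):
    print(node, count)
```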
index management
index rollover based on #documents
#primary shards = #indexing-nodes - 2
keep shard size small enough ~40GB
alerting on shard size
alerting on empty index
some indices still have date suffix -> obsolete
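The rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the team's actual tooling: the function names and the `max_docs` value are made up, while the "indexing nodes minus two" rule and the ~40GB target come from the slide. The rollover body uses the `max_docs` condition of the Elasticsearch rollover API, and the size check expects rows from `GET /_cat/shards?format=json&bytes=b`.

```python
def primary_shard_count(indexing_nodes: int) -> int:
    """#primary shards = #indexing-nodes - 2 (never fewer than one)."""
    return max(1, indexing_nodes - 2)


def rollover_body(max_docs: int) -> dict:
    """Request body for POST /<write-alias>/_rollover based on document count."""
    return {"conditions": {"max_docs": max_docs}}


def oversized_shards(cat_shards_rows, limit_bytes=40 * 1024**3):
    """Return (index, shard, prirep) for shards larger than limit_bytes.

    The default limit treats ~40GB as 40 GiB. Rows without a store size
    (for example unassigned shards) are skipped.
    """
    flagged = []
    for row in cat_shards_rows:
        store = row.get("store")
        if store is None:
            continue
        if int(store) > limit_bytes:
            flagged.append((row["index"], row["shard"], row["prirep"]))
    return flagged


# 16 indexing nodes, as on the cluster breakdown slide.
print(primary_shard_count(16))
# Hypothetical threshold; the talk does not state the document count used.
print(rollover_body(max_docs=500_000_000))
```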
note: show live demo if you dare
troubles aka F…..
too many indices/shards -> done
md freeze under heavy load -> done
small disks on indexing nodes -> done
HDD on indexing nodes -> done
100% SSD disk utilization causing 429 -> done
mixing ES versions in one cluster -> done
wrong kafka configuration for internal topic -> done
kafka consumer group rebalances -> open