Metrics collection using open-source tools
Yaniv Bronhaim.
Maintainer @ VDSM
Senior Software Engineer
Red Hat Israel
Operations Flows
Disk Operations
Failures
Traffic
Data
We will discuss data: the types of data, the ways data is presented, and what we generally do with it.
Data Types
Logs
Metrics
http://metrics20.org/spec/
Sources - data roots
counters: how many likes/fails/views/sign-ins/accesses happened up to a certain time; the value only ever increases
count: what is the average rate of sign-ins over 5 minutes
gauge: a specific value with a timestamp, e.g. read/write speed over time, new TCP connections over time
rates: a specific value per second
flows in logs: migrations, restarts, etc.; these can also be expressed as gauges, rates or counts, but need to be calculated manually
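As a minimal sketch of how these types relate (not tied to any specific tool; the function name and the sample values are hypothetical), a per-second rate can be derived from two readings of an ever-increasing counter, while a gauge is simply read as-is:

```python
from typing import Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def rate_per_second(earlier: Sample, later: Sample) -> float:
    """Derive a per-second rate from two readings of a monotonic counter."""
    (t0, v0), (t1, v1) = earlier, later
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    if v1 < v0:
        # A counter never decreases; a lower value usually means it was reset.
        # Assumption for this sketch: count the new value as the whole increase.
        return v1 / (t1 - t0)
    return (v1 - v0) / (t1 - t0)


# Hypothetical sign-in counter read 300 seconds (5 minutes) apart.
sign_in_rate = rate_per_second((1000.0, 1200.0), (1300.0, 1500.0))
```

Here 300 new sign-ins over 300 seconds give a rate of one sign-in per second.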
Example for data processing architecture goals
Data analysis
(Billing, auto scaling, alerts)
Correlate between distributed logs and metrics
Scale easily
Historical DWH - data aggregation
Setting alarms
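To illustrate the "setting alarms" goal, here is a minimal sketch of a threshold alarm over a sliding window. The class name, the limit and the window size are assumptions made for illustration; real stacks define such rules in their own alerting configuration.

```python
from collections import deque
from statistics import mean


class ThresholdAlarm:
    """Fires when the average of the last `window` values exceeds `limit`."""

    def __init__(self, limit: float, window: int = 3) -> None:
        self.limit = limit
        self.values = deque(maxlen=window)

    def observe(self, value: float) -> bool:
        """Record a value and report whether the alarm is firing."""
        self.values.append(value)
        window_full = len(self.values) == self.values.maxlen
        return window_full and mean(self.values) > self.limit


# Hypothetical CPU usage readings, in percent.
cpu_alarm = ThresholdAlarm(limit=90.0, window=3)
fired = [cpu_alarm.observe(v) for v in (95.0, 97.0, 99.0, 50.0)]
```

Averaging over a window rather than reacting to a single reading keeps one short spike from raising an alarm.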
Client
Shipper
Store
Visualization
Data Processing Pipeline
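The four stages of the pipeline can be sketched in a few lines. This is only an in-memory illustration under stated assumptions: the reading is hard-coded, the "transport" is a JSON string, and all function and class names are hypothetical rather than taken from any of the tools discussed.

```python
import json
from typing import List


def collect() -> dict:
    """Client: produce one reading (hard-coded here instead of a real probe)."""
    return {"host": "host-1", "metric": "cpu.load", "value": 0.75, "ts": 1000}


def ship(reading: dict) -> str:
    """Shipper: serialize the reading for transport."""
    return json.dumps(reading, sort_keys=True)


class Store:
    """Store: keep the received readings (a list stands in for a database)."""

    def __init__(self) -> None:
        self.rows: List[dict] = []

    def write(self, payload: str) -> None:
        self.rows.append(json.loads(payload))


def visualize(store: Store) -> List[str]:
    """Visualization: render the stored readings as text lines."""
    return [f"{r['ts']} {r['host']} {r['metric']}={r['value']}" for r in store.rows]


store = Store()
store.write(ship(collect()))
lines = visualize(store)
```

Each stage only depends on the output of the previous one, which is what allows swapping one tool for another at any stage.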
Simple Case Study #1
Multiplayer game service
Save historical info about the hardware and set alarms and events in the system based on traffic and usage.
Parse multiple logs continuously and visualize statistics based on the log data.
Simple Case Study #1 - Goals
better analysis and management for manual scaling
want to reward players based on events - taken from the logs
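Rewarding players based on events taken from the logs could look roughly like the sketch below. The log line format (`player=... event=...`) and the event name are assumptions made for illustration; the actual game logs are not shown in these slides.

```python
import re
from collections import Counter
from typing import Iterable

# Hypothetical log format: "<timestamp> player=<name> event=<event>"
LINE_RE = re.compile(r"player=(?P<player>\S+)\s+event=(?P<event>\S+)")


def count_events(lines: Iterable[str], event: str) -> Counter:
    """Count how many times each player triggered the given event."""
    counts: Counter = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("event") == event:
            counts[match.group("player")] += 1
    return counts


sample_log = [
    "2016-07-24T10:00:00 player=alice event=level_up",
    "2016-07-24T10:00:05 player=bob event=login",
    "2016-07-24T10:01:00 player=alice event=level_up",
    "malformed line",
]
level_ups = count_events(sample_log, "level_up")
```

Lines that do not match the expected format are skipped rather than treated as errors, since real logs are rarely uniform.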
Case Study #1 - CollectD + Graphite
Metrics analysis solution for monitoring and aggregation.
CollectD
Graphite (Carbon)
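For orientation, Carbon accepts samples over a plaintext protocol in which each line holds a metric path, a value and a Unix timestamp. The sketch below only formats such a line; the metric path and the helper name are hypothetical, and actually sending it (over a TCP socket to Carbon's plaintext listener, port 2003 by default) is left out.

```python
def carbon_line(path: str, value: float, timestamp: int) -> bytes:
    """Format one sample as a newline-terminated 'path value timestamp' line."""
    if " " in path:
        raise ValueError("metric path must not contain spaces")
    return f"{path} {value} {timestamp}\n".encode("ascii")


# Hypothetical metric path and sample.
line = carbon_line("game.host1.cpu.load", 0.75, 1469347200)
```

In the case study CollectD does this shipping itself, so a helper like this is only needed for custom application metrics.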
Case Study #1 - ELK Stack
2. Log analysis solution for dashboards.
Filebeat
Logstash
Elasticsearch
Kibana
Case Study #1
Case Study #2
Large scale, centralized management for KVM based virtualization
Focus on ease of use/deployment
Case Study #2
Historical DWH
polling
Case Study #2 - Goals
Collect basic hardware info and remove such logic from VDSM.
VDSM
oVirt-Engine
Get oVirt-Engine info in the same way as info about physical and virtual entities.
Case Study #2 - Goals
Allow building a monitoring dashboard based on historical data for the last XX years, with aggregation configs.
VDSM
oVirt-Engine
Correlate virtual data with physical data.
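Keeping years of history is usually only practical if raw samples are aggregated into coarser buckets. A minimal sketch of such a rollup follows, assuming averaging per fixed-size time bucket; the function name, the bucket size and the sample data are hypothetical and not the actual DWH configuration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def rollup_average(
    samples: Iterable[Tuple[int, float]], bucket_seconds: int
) -> Dict[int, float]:
    """Average (timestamp, value) samples per fixed-size time bucket."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in samples:
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# Hypothetical samples taken every 30 minutes, rolled up to hourly averages.
raw = [(0, 10.0), (1800, 30.0), (3600, 50.0), (5400, 70.0)]
hourly = rollup_average(raw, bucket_seconds=3600)
```

An average is only one possible aggregation; minimum, maximum or sum per bucket may suit other metrics better.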
Case Study #2 - Using metrics and logs
Getting data (metrics and logs)
Parse data into the storage format (JSON)
CollectD
FluentD
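Parsing into the storage format can be pictured as turning a raw log line into a JSON document. The sketch below is a stand-in for what a shipper does, not FluentD's own parser; the `key=value` log format and the field names (including `@timestamp` and `source`) are assumptions made for illustration.

```python
import json


def to_json_record(line: str, source: str) -> str:
    """Turn '<timestamp> key=value ...' into a JSON document string."""
    timestamp, _, rest = line.partition(" ")
    # Keep only well-formed key=value pairs; anything else is ignored.
    fields = dict(pair.split("=", 1) for pair in rest.split() if "=" in pair)
    record = {"@timestamp": timestamp, "source": source, **fields}
    return json.dumps(record, sort_keys=True)


# Hypothetical log line from a hypervisor host.
doc = to_json_record(
    "2016-07-24T10:00:00 host=host-1 vm=vm-7 event=migration_start", "vdsm"
)
```

Once metrics and logs share one structured format, they can be stored and queried side by side.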
metrics and logs
Case Study #2 - Correlate data and store
Correlate between data sources and store
Scale up abilities
FluentD
Elasticsearch
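One simple way to picture correlation between data sources is to attach to each log event the metric samples from the same host that fall within a short time window. This is a sketch under stated assumptions: the field names, the window size and the join rule are hypothetical, and a real deployment would do this in the shipper or at query time in the store.

```python
from typing import List


def correlate(events: List[dict], metrics: List[dict], window_seconds: int) -> List[dict]:
    """Attach to each event the same-host metrics within the time window."""
    enriched = []
    for event in events:
        nearby = [
            m
            for m in metrics
            if m["host"] == event["host"]
            and abs(m["ts"] - event["ts"]) <= window_seconds
        ]
        enriched.append({**event, "metrics": nearby})
    return enriched


# Hypothetical event and metric samples.
events = [{"host": "host-1", "ts": 100, "event": "migration_start"}]
metrics = [
    {"host": "host-1", "ts": 95, "cpu": 0.9},
    {"host": "host-1", "ts": 500, "cpu": 0.2},
    {"host": "host-2", "ts": 100, "cpu": 0.5},
]
joined = correlate(events, metrics, window_seconds=10)
```

Only the first metric sample is attached: the second is too far away in time and the third belongs to another host.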
Case Study #2 - Centralization and store
Building dashboards and monitors: analyze, visualize, and define alerts and alarms
Grafana
Kibana
Case Study #2 - Output
Our goal is to lead in scale, management and user-friendliness
Open-source alternative: pros and cons
Release cycle every 6 months, with 3 stable branches that are fully supported
Feature-rich; everyone can request features
Based on KVM
prometheus.io
hawkular.org
Bottom line: Investigate your architecture
Clients
Shippers
Stores
Visualization
Prometheus - monitoring and alerting system, built at SoundCloud
http://rancher.com/converting-prometheus-template-cattle-kubernetes/
http://snowplowanalytics.com/product/
Some links for more info
https://bronhaim.wordpress.com/2016/07/24/setup-toturial-for-collecting-metrics-with-statsd-and-grafana-containers/
https://www.ovirt.org/ (oVirt metrics)
And many more how-to tutorials:
https://www.digitalocean.com/community/tutorials/how-to-install-elasticsearch-logstash-and-kibana-elk-stack-on-ubuntu-14-04
https://www.digitalocean.com/community/tutorials/how-to-configure-collectd-to-gather-system-metrics-for-graphite-on-ubuntu-14-04
https://www.infoq.com/articles/graphite-intro
Or reach me - [email protected]