Prometheus
Monitoring system and TSDB:
● instrumentation
● metrics collection and storage
● querying
● alerting
● dashboarding / graphing / trending
Made for dynamic cloud environments!
What is Prometheus? https://prometheus.io
● raw log / event collection
● request tracing
● “magic” anomaly detection
● durable long-term storage
● automatic horizontal scaling
● user / auth management
What does Prometheus NOT do?
● Started in 2012 at SoundCloud by Matt and Julius
● Inspired by Google’s monitoring tools
● Motivation
  ○ needed to monitor dynamic cloud environment
  ○ unsatisfying data models, querying, and efficiency in existing approaches
Origin
Four main improvements
1. Multi-dimensional data model (like OpenTSDB).
2. Powerful query language (the same for exploring, graphing, alerting).
3. Efficient data collection (yes, it's pull, not push).
4. Operational simplicity (unlike OpenTSDB).
Multi-dimensional data model
api_http_requests_total{method="GET", endpoint="/api/tracks", status="200"} 2034834
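The sample above pairs a metric name with a set of key/value labels, and each distinct label combination identifies its own time series. A minimal Python sketch of that idea follows. It assumes a simplified parser: `parse_sample` and `series_key` are hypothetical helpers, not part of any Prometheus client library, and they ignore escaping, timestamps, and other details of the real exposition format.

```python
import re

# Hypothetical helpers: they handle only the simple `name{k="v", ...} value`
# shape shown on the slide, not the full Prometheus exposition format.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_sample(line):
    """Split one sample line into (metric name, labels dict, float value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


def series_key(name, labels):
    """A series is identified by its name plus the full, order-independent label set."""
    return (name, frozenset(labels.items()))


name, labels, value = parse_sample(
    'api_http_requests_total{method="GET", endpoint="/api/tracks", status="200"} 2034834'
)
print(name, labels, value)
```

Changing any single label value (for example `status="500"`) produces a different key, which is what makes each label a separate dimension to filter or group on.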
Powerful query language
topk(3, sum(rate(bazooka_instance_cpu_time_seconds_total[5m])) by (app, proc))
sort_desc(sum(bazooka_instance_memory_limit_bytes - bazooka_instance_memory_usage_bytes) by (app, proc))
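To make the first query easier to read, here is a rough Python sketch of the three steps it chains: a per-second rate per series, a sum grouped by `app` and `proc`, and a top-k selection. The data is made up, and `simple_rate` is a deliberate simplification: PromQL's `rate()` also handles counter resets and extrapolation, which this sketch does not.

```python
from collections import defaultdict


def simple_rate(samples):
    """Per-second increase between the first and last sample of a window.

    Simplification: no counter-reset handling and no extrapolation.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def sum_by(values, keys):
    """Group per-series values by the given label names and sum each group."""
    grouped = defaultdict(float)
    for labels, value in values:
        grouped[tuple(labels[k] for k in keys)] += value
    return dict(grouped)


def topk(k, grouped):
    """Return the k groups with the largest values, biggest first."""
    return sorted(grouped.items(), key=lambda item: item[1], reverse=True)[:k]


# Made-up CPU counters: (labels, [(unix_time, cpu_seconds_total), ...])
series = [
    ({"app": "web", "proc": "api", "instance": "a"}, [(0, 0.0), (300, 150.0)]),
    ({"app": "web", "proc": "api", "instance": "b"}, [(0, 0.0), (300, 90.0)]),
    ({"app": "web", "proc": "worker", "instance": "a"}, [(0, 0.0), (300, 30.0)]),
    ({"app": "db", "proc": "main", "instance": "c"}, [(0, 0.0), (300, 60.0)]),
]

rates = [(labels, simple_rate(samples)) for labels, samples in series]
result = topk(2, sum_by(rates, ("app", "proc")))
print(result)
```

The `instance` label disappears in the grouping step, so the result ranks (app, proc) pairs rather than individual instances.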
Efficient data collection
1000s of targets.
800,000 samples per second.
Millions of time series.
On a single monitoring server.
Running many servers is easy, too…
Pull, not push.
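"Pull, not push" means the monitoring server fetches metrics from each target over HTTP instead of waiting for targets to send them. The sketch below illustrates the idea with the Python standard library only. The toy target, its `demo_requests_total` metric, and the `scrape` helper are assumptions for illustration and say nothing about how Prometheus implements scraping internally.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class MetricsHandler(BaseHTTPRequestHandler):
    """Toy target: serves a fixed plain-text metrics page on every GET."""

    def do_GET(self):
        body = b'demo_requests_total{method="GET"} 42\n'
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_target():
    """Start the toy target on a free localhost port and return the server."""
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Ignore any proxy settings in the environment; the target is local.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def scrape(url, timeout=2.0):
    """Pull one metrics page; a failed pull tells the scraper the target is down."""
    try:
        with OPENER.open(url, timeout=timeout) as response:
            return True, response.read().decode()
    except OSError:
        return False, ""


server = start_target()
target_url = f"http://127.0.0.1:{server.server_port}/metrics"
up, text = scrape(target_url)
server.shutdown()
server.server_close()
print(up, text.strip())
```

One consequence of pulling is visible in `scrape`: the scraper learns that a target is unreachable as a side effect of collecting, without the target having to report anything.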
Challenges in Dynamic Environments
● on-demand VMs (EC2, Azure, GCP, ...)
● dynamically scheduled service instances (Kubernetes, Docker Swarm, ...)
● microservices
⇨ many services, dynamic hosts, and ports
How to make sense of this mess?
Monitoring Dynamic Environments
● Use service discovery
  ○ ...to know what should be there
  ○ ...to pull metrics
  ○ ...to add metadata to metrics
● Focus on services, not machines
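The three uses of service discovery listed above can be sketched in a few lines of Python. Everything here is an in-memory stand-in: `discover` returns a hard-coded snapshot, `fake_scrape` replaces the HTTP pull, and the `zone` label and addresses are invented. The `job`, `instance`, and `up` names follow Prometheus conventions.

```python
def discover():
    """Hypothetical discovery snapshot; a real one would query an orchestrator or cloud API."""
    return [
        {"address": "10.0.0.1:8080", "service": "api", "zone": "eu-1"},
        {"address": "10.0.0.2:8080", "service": "api", "zone": "eu-2"},
        {"address": "10.0.0.3:9100", "service": "worker", "zone": "eu-1"},
    ]


def fake_scrape(address):
    """Stand-in for an HTTP pull; returns None when the target does not answer."""
    responses = {
        "10.0.0.1:8080": {"requests_total": 120.0},
        "10.0.0.3:9100": {"requests_total": 7.0},
    }
    return responses.get(address)


def scrape_all(targets, scrape):
    """Pull every discovered target and attach discovery metadata as labels."""
    samples = []
    for target in targets:
        labels = {
            "instance": target["address"],
            "job": target["service"],
            "zone": target["zone"],
        }
        metrics = scrape(target["address"])
        # Discovery says the target should exist, so no answer is recorded as up=0.
        samples.append(("up", labels, 1.0 if metrics is not None else 0.0))
        for name, value in (metrics or {}).items():
            samples.append((name, labels, value))
    return samples


samples = scrape_all(discover(), fake_scrape)
missing = [
    labels["instance"]
    for name, labels, value in samples
    if name == "up" and value == 0.0
]
print(missing)
```

Because the expected targets come from discovery rather than from the metrics themselves, an instance that never answers still shows up, here as the single entry in `missing`.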
...with Prometheus
● configure service in Prometheus
  ○ automatic discovery and scraping
● map host, port, service etc. into dimensions
● query language enables:
  ○ service-level aggregation
  ○ instance-level drill-down
  ○ precise alerting
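Once host, port, and service are ordinary labels, the three capabilities in the last bullet reduce to grouping, filtering, and thresholding over the same data. The Python sketch below uses invented error rates and hypothetical helper names (`aggregate`, `drill_down`, `firing`); it mirrors the idea only, not PromQL syntax or Prometheus alerting rules.

```python
from collections import defaultdict

# Made-up per-instance error rates (errors per second), labelled by job and instance.
error_rates = [
    ({"job": "api", "instance": "10.0.0.1:8080"}, 0.2),
    ({"job": "api", "instance": "10.0.0.2:8080"}, 4.5),
    ({"job": "worker", "instance": "10.0.0.3:9100"}, 0.1),
]


def aggregate(samples, by):
    """Service-level view: sum values across all series sharing the `by` label."""
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels[by]] += value
    return dict(totals)


def drill_down(samples, **matchers):
    """Instance-level view: keep only series whose labels equal all matchers."""
    return [
        (labels, value)
        for labels, value in samples
        if all(labels.get(k) == v for k, v in matchers.items())
    ]


def firing(samples, by, threshold):
    """Alert on the aggregated value, so one alert fires per service, not per host."""
    totals = aggregate(samples, by)
    return sorted(key for key, total in totals.items() if total > threshold)


per_job = aggregate(error_rates, by="job")
api_instances = drill_down(error_rates, job="api")
alerts = firing(error_rates, by="job", threshold=1.0)
print(per_job, api_instances, alerts)
```

The alert names the `api` service as a whole, and the drill-down on `job="api"` then shows which instance contributes most of the errors.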
Prometheus <3 Kubernetes
● Borg -> Kubernetes
● Borgmon -> Prometheus
● both use labels
● Prometheus supports Kubernetes SD
● Kubernetes has Prometheus metrics