Monitoring Containers with Weave Scope
Uploaded by weaveworks
Transcript of Monitoring Containers with Weave Scope
Monitoring Containers
David Kaltschmidt
@davkals
Rogue waves present considerable danger for several reasons:
• unpredictable
• may appear suddenly or without warning
• and can impact with tremendous force.
Performance Methodologies
• For system engineers: ways to analyse unfamiliar systems
• For app developers: guidance for metric and dashboard design
- Brendan Gregg’s Systems Methodology
Traffic Light Anti-Method
1. Turn all metrics into traffic lights
2. Everything green? No worries, mate.
- Brendan Gregg’s Systems Methodology
🚦
https://github.com/weaveworks/scope
Intuition Engineering:
we need a tool that gives us an intuitive understanding of the entire system at a glance.
http://cloud.weave.works/
Weave Cloud
Hosted version of Weave Scope
Runs on K8s
DEMO
Great, but…
• Short-lived connections
• Hairballs, and shifting layouts
Scope OSS Architecture
[Diagram: one scope-probe on each of Host 1, Host 2 and Host 3; scope-app; Browser]
Connection Tracking
/home/weave # conntrack -E
[DESTROY] tcp 6 src=172.17.0.10 dst=10.128.0.1 sport=41066 dport=80 src=172.17.0.1 dst=172.17.0.10 sport=42525 dport=41066 [ASSURED]
[DESTROY] tcp 6 src=192.168.99.100 dst=192.168.99.100 sport=36236 dport=32778 src=172.17.0.8 dst=192.168.99.100 sport=80 dport=36236 [ASSURED]
[DESTROY] tcp 6 src=172.17.0.10 dst=10.128.0.1 sport=41068 dport=80 src=172.17.0.1 dst=172.17.0.10 sport=42525 dport=41068 [ASSURED]
[DESTROY] tcp 6 src=192.168.99.100 dst=192.168.99.100 sport=52996 dport=32776 src=172.17.0.6 dst=192.168.99.100 sport=80 dport=52996 [ASSURED]
[DESTROY] tcp 6 src=172.17.0.10 dst=10.128.0.1 sport=41070 dport=80 src=172.17.0.1 dst=172.17.0.10 sport=42525 dport=41070 [ASSURED]
[DESTROY] tcp 6 src=192.168.99.100 dst=192.168.99.100 sport=52998 dport=32776 src=172.17.0.6 dst=192.168.99.100 sport=80 dport=52998 [ASSURED]
[DESTROY] tcp 6 src=172.17.0.10 dst=10.128.0.1 sport=41072 dport=80 src=172.17.0.1 dst=172.17.0.10 sport=42525 dport=41072 [ASSURED]
[DESTROY] tcp 6 src=192.168.99.100 dst=192.168.99.100 sport=57975 dport=32777 src=172.17.0.7 dst=192.168.99.100 sport=80 dport=57975 [ASSURED]
[DESTROY] tcp 6 src=172.17.0.10 dst=10.128.0.1 sport=41074 dport=80 src=172.17.0.1 dst=172.17.0.10 sport=42525 dport=41074 [ASSURED]
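To make the conntrack events above easier to read, here is a minimal Python sketch that splits one event line into its event type, protocol, and the original and reply tuples. The function name `parse_conntrack_event` and the returned dictionary layout are made up for illustration; this is not how scope-probe itself consumes conntrack.

```python
import re

# One event copied from the conntrack -E output above.
LINE = ("[DESTROY] tcp 6 src=172.17.0.10 dst=10.128.0.1 sport=41066 dport=80 "
        "src=172.17.0.1 dst=172.17.0.10 sport=42525 dport=41066 [ASSURED]")


def parse_conntrack_event(line):
    """Split one `conntrack -E` line into event, protocol and the two tuples.

    Hypothetical helper for illustration only.
    """
    event = re.match(r"\[(\w+)\]", line).group(1)
    proto = line.split()[1]
    pairs = re.findall(r"(src|dst|sport|dport)=(\S+)", line)
    # conntrack prints the original direction first, then the reply direction.
    original = dict(pairs[:4])
    reply = dict(pairs[4:8])
    return {"event": event, "proto": proto, "original": original, "reply": reply}


if __name__ == "__main__":
    print(parse_conntrack_event(LINE))
```

Nothing in an event line identifies the process that owned the connection, so a parser like this can only recover addresses and ports.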
/home/weave # cat /proc/net/tcp
  sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode
  0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 16810 1 ffff8800d79c1800 100 0 0 10 0
  1: 0100007F:EB74 0100007F:0FC8 06 00000000:00000000 03:0000016D 00000000 0 0 0 3 ffff8800ae3f6e80
  2: 0100007F:EB69 0100007F:0FC8 01 00000000:00000000 00:00000000 00000000 0 0 307011 1 ffff8800cf467040 21 4 30 10 -1
  3: 0100007F:EB7B 0100007F:0FC8 06 00000000:00000000 03:00000D27 00000000 0 0 0 3 ffff8800d7a47538
  4: 0100007F:EB7C 0100007F:0FC8 06 00000000:00000000 03:0000110E 00000000 0 0 0 3 ffff8800cf656c70
  5: 0100007F:EB67 0100007F:0FC8 01 00000000:00000000 00:00000000 00000000 0 0 306868 1 ffff8800d79c1040 21 4 27 10 -1
  6: 0100007F:EB76 0100007F:0FC8 06 00000000:00000000 03:00000556 00000000 0 0 0 3 ffff8800d37ac748
  7: 0100007F:EB7F 0100007F:0FC8 06 00000000:00000000 03:000014F7 00000000 0 0 0 3 ffff8800d87f0c70
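The hex columns in the /proc/net/tcp listing above are hard to read by eye. The following sketch decodes the local address, remote address and state of one row. It assumes a little-endian host (such as x86), where the kernel prints the IPv4 address with its bytes reversed; the helper names and the three-entry state table are illustrative, not taken from Scope.

```python
import socket
import struct

# Subset of TCP state codes as printed in the "st" column.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "0A": "LISTEN"}

# Row 1 copied from the listing above.
ROW = ("1: 0100007F:EB74 0100007F:0FC8 06 00000000:00000000 03:0000016D "
       "00000000 0 0 0 3 ffff8800ae3f6e80")


def decode_addr(hex_addr):
    """Turn an 'IP:PORT' hex pair into a dotted-quad string and an int port."""
    ip_hex, port_hex = hex_addr.split(":")
    # Assumption: little-endian host, so the 32-bit address is byte-swapped.
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def parse_row(row):
    """Decode the local address, remote address and state of one row."""
    fields = row.split()
    return {
        "local": decode_addr(fields[1]),
        "remote": decode_addr(fields[2]),
        "state": TCP_STATES.get(fields[3], fields[3]),
    }


if __name__ == "__main__":
    print(parse_row(ROW))
```

Because the file is a snapshot, a connection that opens and closes between two reads never shows up in it.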
Stop sampling, start listening
eBPF
• user-defined sandboxed kernel programs
• live-instrumentation on vanilla kernel
• listen to connection events the same way conntrack does, but with the PID(!)
Have you used D3.js?
Heuristic I: Alignment
Heuristic II: Edge crossings
Heuristic III: Commensurate layout changes
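As an illustration of the edge-crossings heuristic, the sketch below counts crossings between two adjacent rows of a layered drawing. It assumes nodes are placed in horizontal layers, which the slides do not confirm for Scope; the node names and the function `count_crossings` are hypothetical.

```python
from itertools import combinations


def count_crossings(top_order, bottom_order, edges):
    """Count edge crossings between two layers.

    top_order / bottom_order: node names, left to right.
    edges: (top_node, bottom_node) pairs.
    """
    top_pos = {n: i for i, n in enumerate(top_order)}
    bottom_pos = {n: i for i, n in enumerate(bottom_order)}
    crossings = 0
    for (a, b), (c, d) in combinations(edges, 2):
        # Two edges cross when their endpoints appear in opposite
        # left-to-right order on the two layers.
        if (top_pos[a] - top_pos[c]) * (bottom_pos[b] - bottom_pos[d]) < 0:
            crossings += 1
    return crossings


if __name__ == "__main__":
    edges = [("web", "cache"), ("api", "db")]
    print(count_crossings(["web", "api"], ["db", "cache"], edges))
    print(count_crossings(["web", "api"], ["cache", "db"], edges))
```

A layout step could compare this count for candidate orderings and keep the lower one, while limiting how far nodes move between frames so that layout changes stay commensurate with changes in the data.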
Back to Monitoring Containers
USE vs RED
USE: For every resource, check:
• Utilisation
• Saturation
• Errors
RED: For every service, check:
• Request Rate
• Error rate
• Duration (latency distribution)
*http://www.brendangregg.com/usemethod.html
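To make the RED checklist concrete, here is a small sketch that derives request rate, error rate and two duration percentiles from a window of observed requests. The input shape (a list of `(duration_seconds, is_error)` tuples), the nearest-rank percentile and the name `red_summary` are assumptions for illustration.

```python
import math


def red_summary(requests, window_seconds):
    """Summarise a window of requests as Rate, Errors, Duration.

    requests: list of (duration_seconds, is_error) tuples (hypothetical shape).
    """
    total = len(requests)
    errors = sum(1 for _, is_error in requests if is_error)
    durations = sorted(d for d, _ in requests)

    def percentile(p):
        # Nearest-rank percentile; returns None for an empty window.
        if not durations:
            return None
        rank = max(1, math.ceil(p / 100 * len(durations)))
        return durations[rank - 1]

    return {
        "request_rate": total / window_seconds,          # requests per second
        "error_rate": errors / total if total else 0.0,  # fraction of failures
        "p50": percentile(50),
        "p99": percentile(99),
    }


if __name__ == "__main__":
    sample = [(0.1, False), (0.2, False), (0.3, True), (0.4, False)]
    print(red_summary(sample, window_seconds=2))
```

Reporting percentiles rather than a mean keeps the "latency distribution" part of Duration visible, since a few slow requests barely move an average.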
New data sources
• plugins can add metadata and metrics
• eBPF (ongoing)
• custom instrumentation
Prometheus & K8s
• Kubernetes is already instrumented for Prometheus
• Application-level metrics from instrumentation
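Application-level instrumentation ultimately means exposing metrics in a form Prometheus can scrape. The sketch below renders a counter in the Prometheus text exposition format using only the standard library. The metric name `http_requests_total`, its labels and the helper `render_counter` are hypothetical, and label values are not escaped; a real service would normally use a Prometheus client library instead.

```python
def render_counter(name, help_text, samples):
    """Render one counter in the Prometheus text exposition format.

    samples: dict mapping a tuple of (label, value) pairs to a count.
    Simplification: label values are not escaped.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in sorted(samples.items()):
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    counts = {
        (("code", "200"), ("method", "get")): 3,
        (("code", "500"), ("method", "get")): 1,
    }
    print(render_counter("http_requests_total", "Total HTTP requests.", counts), end="")
```

Splitting the counter by a status-code label gives both the request rate and the error rate of the RED method from a single metric.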
Hosted Prometheus, multi-tenant (OSS)
Got a version running in Weave Cloud
Run local Prometheus with a remote destination
https://github.com/weaveworks/prism
We’re hiring!
London, Berlin, San Francisco
Questions?
David Kaltschmidt
@davkals https://weave.works