with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and...
Transcript of with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and...
![Page 1: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/1.jpg)
Analyzing Performance of OpenStack with Grafana Dashboards
GrafanaCon EU 2018
Alex KrzosSenior Software EngineerMarch 2nd 2018
![Page 2: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/2.jpg)
2
What is OpenStack Example Perf and Scale Analysis
What is the problem? What can be improved?
What is the Monitoring Stack Questions
Dashboards Dashboards DashboardsA dashboard is worth a thousand words
Agenda
![Page 3: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/3.jpg)
3
Alex Krzos
Senior Software Engineer @ Red Hat Inc
On Red Hat OpenStack Scale and Performance Team
2 years on OpenStack
2 years ManageIQ & RH Satellite Performance
4 years at Cisco System/Network testing
Recently completed Army National Guard Obligation
# whoami
![Page 4: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/4.jpg)
4
Open Source Cloud Software - Cloud Operating System
64 official projects, 1582 repos in github.com/openstack
What is OpenStack
![Page 5: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/5.jpg)
5
Major Projects:
● Keystone - Identity● Nova - Compute● Neutron - Networking● Swift - Object Storage● Cinder - Block Storage● Glance - Image Storage● Telemetry - Time-Series Data● Horizon - Dashboard GUI
What is OpenStack
![Page 6: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/6.jpg)
6
Many services, daemons, databases and messaging bus to monitor
Complexity of service interactions
Varying node counts and node types
Large configuration files
Data capture on testbeds
Errors / Misconfiguration
What are the problems?
![Page 7: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/7.jpg)
7
What are the problems? - Complexity
![Page 8: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/8.jpg)
8
What is our Monitoring Stack
![Page 9: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/9.jpg)
9
Benchmarking Tools
![Page 10: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/10.jpg)
10
Deploying the Dashboards via Ansible:
# ansible-playbook -i hosts install/grafana-dashboards.yml
Deploying the Dashboards
![Page 11: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/11.jpg)
11
Originally: .json and .json.j2
Converted to GrafYaml (.yaml, .yaml.j2)
Yaml advantages:
● Less Lines● Less Curly braces and quotes● Comments
GrafYaml Advantages:
● Manages ID / RefID● Defaults reduce lines stored in Yaml● Simple CLI
How Dashboards are Stored
Jinja2 Template Advantages
● Reduce duplication● Reuse by organizing into separate files
GrafYaml+Jinja2 reduced line count:
40,000 lines to 6,500 lines
![Page 12: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/12.jpg)
12
4 Types of Dashboards
● Static Dashboards (.yaml)● Templated “Static” Dashboards ● Templated General Dashboards● Cloud Specific Dashboards (Per-Cloud)
Dashboards, Dashboards, Dashboards
![Page 13: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/13.jpg)
13
Cloud Instance Count
Cloud Total Memory Usage
Cloud System Performance Comparison
Cloud Keystone Token Count
Apache Request Latency
Static Dashboards
![Page 14: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/14.jpg)
14
Static Dashboards - Performance Comparison
![Page 15: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/15.jpg)
15
Cloud Ceph
Cloud Rabbitmq
Three Node Performance
Gnocchi Status
Gnocchi Performance
Templated Dashboards
![Page 16: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/16.jpg)
16
Templated Dashboards
![Page 17: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/17.jpg)
17
Templated General DashboardsNode Types:
● Undercloud● Controller● Ceph Storage● Block/Object Storage● Computes
![Page 18: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/18.jpg)
18
Exposes Each Node in
Ansible Inventory
Includes:
● CPU● Memory● Disk● Network● Log
Cloud Specific Dashboards - CPU
![Page 19: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/19.jpg)
19
Cloud Specific Dashboards - Memory
![Page 20: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/20.jpg)
20
Cloud Specific Dashboards - Disk
![Page 21: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/21.jpg)
21
Cloud Specific Dashboards - Network
![Page 22: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/22.jpg)
22
Cloud Specific Dashboards - Log
![Page 23: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/23.jpg)
Example Performance and Scale Analysis
23
![Page 24: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/24.jpg)
24
Example - OpenStack Telemetry ScalingOpenStack Pike OpenStack Ocata
![Page 25: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/25.jpg)
25
Example - Slow API PostThreaded Batch
![Page 26: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/26.jpg)
26
Example - Coordination LossDaemon lost contact with coordination service
![Page 27: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/27.jpg)
27
Example - SMIs using more CPUOvercloud-compute-4 has 480 SMIs every 10s resulting in higher CPU util, Set “OS Control” in your BIOS power settings...
![Page 28: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/28.jpg)
28
Example - Uneven MemoryVisualize distribution of instances in uneven compute memory environment
![Page 29: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/29.jpg)
29
Example - Boot TimingsUnpatched vs Patched Boot Scenario
![Page 30: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/30.jpg)
30
Dashboard Storage / Reuse
Queries for different Data Sources
Preserving Data (Snapshotting)
Time Series Data Units
Standard Time Series Benchmarks
Metrics interval
What can be improved?
![Page 31: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/31.jpg)
Questions?
![Page 32: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/32.jpg)
32
Browbeat - https://github.com/openstack/browbeat
GrafYaml - https://github.com/openstack-infra/grafyaml
Ansible - https://github.com/ansible/ansible
OpenStack - https://www.openstack.org/
References / Links
![Page 33: with Grafana Dashboards Analyzing Performance of OpenStack · 2 What is OpenStack Example Perf and Scale Analysis What is the problem? What can be improved? What is the Monitoring](https://reader034.fdocuments.us/reader034/viewer/2022042103/5e803f93af18f91224057a01/html5/thumbnails/33.jpg)
THANK YOU
plus.google.com/+RedHat
linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHatNews