Service Level Monitoring with Nagios
Athanasios Douitsis, Dimitrios Kalogeras
National Technical University of Athens, Network Operations Center
Outline
● Introduction
● Design & Architecture
● Implementation
● Conclusions
SLA metrics
● Dependence of modern applications (e.g. VoIP) on characteristics such as
– Packet Round Trip Time.
– Jitter.
– Packet Loss.
– One way delay.
● Definition of Service Availability and SLAs based on these qualities.
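As an illustration of these metrics, the following Python sketch derives average round trip time, jitter and packet loss from a list of probe samples. The function name, the use of `None` to mark a lost packet, and the definition of jitter as the mean absolute difference between consecutive RTTs are assumptions made for this example; the slides do not define them.

```python
from statistics import mean

def summarise_rtt(samples_ms):
    """Summarise RTT samples given in milliseconds; None marks a lost packet."""
    received = [s for s in samples_ms if s is not None]
    lost = len(samples_ms) - len(received)
    loss_pct = 100.0 * lost / len(samples_ms) if samples_ms else 0.0
    avg_rtt = mean(received) if received else None
    # Assumed jitter definition: mean absolute difference between
    # consecutive received RTT values.
    diffs = [abs(b - a) for a, b in zip(received, received[1:])]
    jitter = mean(diffs) if diffs else 0.0
    return {"avg_rtt_ms": avg_rtt, "jitter_ms": jitter, "loss_pct": loss_pct}

if __name__ == "__main__":
    # Five packets sent, the third one lost.
    print(summarise_rtt([20.0, 22.0, None, 21.0, 25.0]))
```

One way delay is left out of the sketch because it needs timestamps from both ends of the path.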
Aspects of SLA monitoring
● Execution of measurements using Network Probes.
● Collection of data.
● Processing.
● Presentation of information.
● Integration with Network Management Systems.
Goals
● Accuracy of measurements.
● Easy deployment of probes.
● Low installation cost.
● Simplified operation.
● Efficient communication.
● Resiliency of data path to outages.
● Standards compliance.
● Modularity and reusability.
Architecture Overview
Probe Implementation
Cisco Service Assurance Agent (IOS feature).
● Definition of functional units (Probes).
● Scheduling of probes.
● Measurement of Round Trip Time, RTT variance (jitter), one way delay, packet loss.
● Aggregation of measurements and inclusion of statistical properties.
● Data available through the SAA-MIB for up to 24 hours back.
Data Collector
● Gathering of measurement data from SAA capable devices.
● Storing of data in a database.
● Usage of SNMP for gathering phase.
● Designed to run unattended
– Ability to detect reloads and outages.
– Ability to detect new probes as they appear.
● Many instances can work together.
Database Schema
● Tag registry.
– Probe ID (router, probe tag).
– Probe characteristics.
● Echo probes table.
● Jitter probes table.
● Indexed by probe ID and time to optimise large queries.
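A schema along these lines could be sketched as follows, using Python's built-in `sqlite3` module. The slides name neither the database engine nor the columns, so every table name, column name and the choice of SQLite here is an assumption for illustration only.

```python
import sqlite3

# Hypothetical schema: one registry row per (router, probe tag), plus one
# measurement table per probe type, each indexed by probe ID and time.
SCHEMA = """
CREATE TABLE tag_registry (
    probe_id    INTEGER PRIMARY KEY,
    router      TEXT NOT NULL,
    probe_tag   TEXT NOT NULL,
    probe_type  TEXT NOT NULL,   -- 'echo' or 'jitter'
    last_set    INTEGER,         -- creation time of the newest stored set
    UNIQUE (router, probe_tag)
);
CREATE TABLE echo_probes (
    probe_id    INTEGER NOT NULL REFERENCES tag_registry(probe_id),
    start_time  INTEGER NOT NULL,
    rtt_avg_ms  REAL,
    completions INTEGER,
    failures    INTEGER
);
CREATE TABLE jitter_probes (
    probe_id    INTEGER NOT NULL REFERENCES tag_registry(probe_id),
    start_time  INTEGER NOT NULL,
    rtt_avg_ms  REAL,
    jitter_ms   REAL,
    packet_loss INTEGER
);
CREATE INDEX echo_by_probe_time ON echo_probes (probe_id, start_time);
CREATE INDEX jitter_by_probe_time ON jitter_probes (probe_id, start_time);
"""

def create_schema(conn):
    """Create the registry, both measurement tables and their indexes."""
    conn.executescript(SCHEMA)

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    for (name,) in conn.execute("SELECT name FROM sqlite_master ORDER BY name"):
        print(name)
```

The composite index on probe ID and start time is what lets a query for one probe over a long period avoid scanning the whole table.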
Nagios
● Network and Service monitoring tool.
● Concept of hosts and services.
● Web interface, mail alerts, paging.
● Extensible architecture through service checkers (Nagios Plugins).
● Plugins == external commands.
● Well known plugin interface.
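The plugin interface mentioned above is small: a plugin is an external command that prints one status line (optionally followed by performance data after a `|`) and reports its state through the exit code, where 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. The Python sketch below shows that contract; the function names, the "JITTER" service label and the "higher is worse" threshold logic are illustrative assumptions, not the authors' plugin.

```python
# Exit codes defined by the Nagios plugin interface.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

def evaluate(value, warn, crit):
    """Map a measured value to a Nagios state; higher values are worse."""
    if value is None:
        return UNKNOWN
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK

def format_status(service, state, value, unit="ms"):
    """Build the single status line, with performance data after '|'."""
    if value is None:
        return f"{service} {LABELS[state]} - no data"
    return f"{service} {LABELS[state]} - {value}{unit} | value={value}{unit}"

def run_check(value, warn, crit, service="JITTER"):
    """Print the status line and return the exit code for the plugin."""
    state = evaluate(value, warn, crit)
    print(format_status(service, state, value))
    return state
```

A real plugin would end by passing the returned state to `sys.exit()`, since Nagios reads the state from the process exit code rather than from the text.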
Integration with Nagios
● Nagios plugin capable of reading from the measurement database.
● Comparison against predefined metrics such as
– Downtime calculation.
– Jitter thresholds tolerable for VoIP operation.
● Monitoring of SLAs on hourly, daily, weekly, monthly and yearly basis.
● Fine grained monitoring for work hours, work days and calendar periods.
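A downtime calculation restricted to work hours could look like the Python sketch below. It assumes each hourly set carries a start time plus counts of completed and failed probe operations, and it defines downtime as the share of failed operations; both the data layout and that definition are assumptions, as is the 09:00 to 17:00, Monday to Friday work period.

```python
from datetime import datetime

def in_work_hours(ts, start_hour=9, end_hour=17):
    """True for Monday-Friday timestamps inside [start_hour, end_hour)."""
    return ts.weekday() < 5 and start_hour <= ts.hour < end_hour

def downtime_percent(hourly_sets, period_filter=lambda ts: True):
    """Percentage of failed probe operations in the selected hourly sets.

    Each set is a tuple (start_time, completions, failures).
    Returns None when no operations fall inside the selected period.
    """
    selected = [s for s in hourly_sets if period_filter(s[0])]
    total = sum(c + f for _, c, f in selected)
    failed = sum(f for _, _, f in selected)
    return 100.0 * failed / total if total else None

if __name__ == "__main__":
    sets = [
        (datetime(2024, 1, 1, 10), 60, 0),   # Monday, work hours
        (datetime(2024, 1, 1, 11), 30, 30),  # Monday, work hours
        (datetime(2024, 1, 1, 22), 0, 60),   # Monday night
        (datetime(2024, 1, 6, 10), 0, 60),   # Saturday
    ]
    print(downtime_percent(sets))                 # whole period
    print(downtime_percent(sets, in_work_hours))  # work hours only
```

Passing the period as a filter function keeps the downtime arithmetic identical for hourly, daily or calendar-based windows.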
SAA probe details
● Usage of ICMP echo and Jitter probes on routers.
● Requirement for the SAA responder feature in order for the Jitter probes to function.
● Probe configuration: 10 × 1000-bit packets every minute for each probe.
● Hourly aggregation of measurements.
● Back store tuned to 3 hours to minimise memory usage.
Probe deployment strategy
● Standard SAA configuration same for all selected devices to homogenise deployment.
● Echo every router in the network (All pairs).
● Jitter every router in the network (All pairs).
● Probe configuration installed on all routers capable of accurate measurements.
● Responder installed everywhere.
● Configuring and starting of probes using TFTP.
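To get a feel for what an all-pairs deployment costs, the Python sketch below counts probes and estimates the measurement traffic. It assumes one probe per ordered router pair, reads the probe configuration from the previous slide as ten 1000-bit packets per minute, and ignores reply packets and protocol overhead. The router count of 20 is a made-up example value.

```python
def probes_per_type(n_routers):
    """Ordered router pairs: every router probes every other router."""
    return n_routers * (n_routers - 1)

def probe_load_bps(packets_per_minute=10, bits_per_packet=1000):
    """Average one-way load of a single probe in bits per second."""
    return packets_per_minute * bits_per_packet / 60

def total_load_bps(n_routers, probe_types=2):
    """Network-wide load for echo plus jitter probes on all pairs."""
    return probes_per_type(n_routers) * probe_types * probe_load_bps()

if __name__ == "__main__":
    n = 20  # hypothetical network size
    print(probes_per_type(n), "probes per type")
    print(round(probe_load_bps(), 1), "bit/s per probe")
    print(round(total_load_bps(n) / 1000, 1), "kbit/s in total")
```

The probe count grows quadratically with the number of routers, while the per-probe load stays small, which fits the preference for a single standard configuration pushed to every device.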
Collector internals
1. Establishment of SNMP session to the target device.
2. Discovery of available probes (by their tag) and updating of Tag registry accordingly.
3. For each individual probe:
   1. Gathering of available hour sets.
   2. Insertion of sets that have not been inserted yet based on their creation time.
   3. Updating of the timestamps in the tag registry.
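The per-probe part of this loop (step 3) can be sketched in Python as below. SNMP access is left out on purpose: `available_sets` stands in for whatever the device returned, and the in-memory `registry` dictionary and `store` list stand in for the tag registry and measurement tables. All names are hypothetical.

```python
def collect_probe(registry, store, probe_id, available_sets):
    """Insert hourly sets newer than the registry timestamp for one probe.

    registry:       dict mapping probe_id to the creation time of the
                    newest set already stored
    store:          list receiving (probe_id, created, data) rows
    available_sets: iterable of (created, data) pairs read from the device
    Returns the number of newly inserted sets.
    """
    # A probe that was just discovered has no timestamp yet.
    last_stored = registry.get(probe_id, float("-inf"))
    new_sets = sorted(
        (s for s in available_sets if s[0] > last_stored),
        key=lambda s: s[0],
    )
    for created, data in new_sets:
        store.append((probe_id, created, data))
    if new_sets:
        registry[probe_id] = new_sets[-1][0]
    return len(new_sets)

if __name__ == "__main__":
    registry, store = {}, []
    probe = ("router1", "echo-router2")
    print(collect_probe(registry, store, probe, [(3600, "a"), (7200, "b")]))
    # The next poll sees the same two sets plus one new one.
    print(collect_probe(registry, store, probe,
                        [(3600, "a"), (7200, "b"), (10800, "c")]))
```

Because only sets newer than the stored timestamp are inserted, the collector can be rerun at any interval shorter than the device's back store without creating duplicates.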
Nagios plugin usage
● Reading from database.
● Calculation of metric.
● Comparison with threshold.
● Same plugin for both echo and jitter probes.
● Pluggable time queries.
● Pluggable metric queries (downtime, jitter etc.).
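One way to read "pluggable" here is as two lookup tables, one for time windows and one for metrics, that a single plugin consults at run time. The Python sketch below follows that reading; the table names, row layout and metric definitions are assumptions made for illustration and are not taken from the authors' plugin.

```python
from datetime import datetime, timedelta

def downtime_pct(rows):
    """Share of failed probe operations, in percent."""
    total = sum(r["completions"] + r["failures"] for r in rows)
    return 100.0 * sum(r["failures"] for r in rows) / total if total else None

def worst_jitter_ms(rows):
    """Largest jitter value among rows that carry one (jitter probes)."""
    return max((r["jitter_ms"] for r in rows if "jitter_ms" in r), default=None)

# Hypothetical registries: adding a metric or a period means adding an entry.
METRIC_QUERIES = {"downtime": downtime_pct, "jitter": worst_jitter_ms}
TIME_QUERIES = {
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
}

def check(rows, metric, period, threshold, now):
    """Select the window, compute the metric, compare with the threshold."""
    since = now - TIME_QUERIES[period]
    window = [r for r in rows if r["time"] >= since]
    value = METRIC_QUERIES[metric](window)
    breached = value is not None and value > threshold
    return value, breached

if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)
    rows = [
        {"time": datetime(2024, 1, 1, 11, 30), "completions": 50,
         "failures": 10, "jitter_ms": 4.0},
        {"time": datetime(2023, 12, 31, 18, 0), "completions": 60,
         "failures": 0, "jitter_ms": 9.0},
    ]
    print(check(rows, "downtime", "hour", 5.0, now))
    print(check(rows, "jitter", "day", 10.0, now))
```

Rows without a `jitter_ms` field are simply skipped by the jitter metric, which is how the same code path can serve both echo and jitter probes.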
Experience
● Deployment extremely easy, low administration overhead.
● Integration with existing Nagios installation beneficial.
● Combined collector, database and plugin activity light in terms of processing load.
But:
● Various SAA bugs.
Future ideas
● Ditch SAA, use various MIBs from DISMAN working group.
● Measurements available through web service.
● Usage of NMWG XML schema.