OSMC 2014: OpenNMS 14 | Tarus Balog
#MonitoringSucks
http://www.adventuresinoss.com
Agenda
•OpenNMS Overview
– History
– Main Feature Areas
– Organization
•OpenNMS 14
– Topology (demo)
– Wall Boards
– Ops Panel
•Automation Demo
•Questions and Answers
OpenNMS is the world's first
enterprise-grade network
management application
platform developed under the
open source model.
“world's first”
•NetSaint 2000-01-10 1323
•OpenNMS 2000-03-29 4141
•Zabbix 2001-03-23 23494
•Nagios 2001-05-03 26589
•RRDTool 2003-01-13 71544
•ZenOSS 2006-03-20 163126
•Icinga 2009-04-21
“enterprise-grade”
• Nearly 63,000 Devices on One Instance (Swisscom)
• 1.2 Million Data Points Every Five Minutes (New Edge)
• 320,000 Interfaces per Device (Wind)
• 2000 events/sec (SRNS)
• 4000 Remote Monitors (Papa Johns)
“network management application platform”
The Architecture of OpenNMS has been
designed to allow for easy integration of
other tools, both proprietary and open.
“open source model”
OpenNMS is published under the AGPLv3
and all components are licensed under an
OSI-qualified free software license.
The Four Main Areas of OpenNMS
•Event Management: including custom events, SNMP traps,
syslog, event translation, automations and correlation.
•Provisioning: Both automated and directed discovery. Fully
supported via ReST.
•Performance Data Collection: SNMP, HTTP, XML, JDBC,
JMX, WMI
•Service Assurance: Service Checks with Outage Models
opennms.org
opennms.com
• Home of The OpenNMS Group, Inc.
• Provides Services:
– Support
– Consulting
– Custom Development
– Training
– Licensing
opennms.eu
OpenNMS Versions (old)
• Stable (Production) Versions Have an Even Number:
– 1.8
– 1.10
– 1.12
• Unstable (Development) Versions Have an Odd Number:
– 1.7
– 1.9
– 1.11
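The old numbering rule can be written as a small check. This is a minimal Python sketch, not anything shipped with OpenNMS: the function name `is_stable` is hypothetical, and it assumes version strings of the form `major.minor[.patch]`.

```python
def is_stable(version: str) -> bool:
    """Return True when the minor number (second component) is even."""
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least 'major.minor', got {version!r}")
    minor = int(parts[1])
    # Even minor number: stable (production). Odd: unstable (development).
    return minor % 2 == 0


if __name__ == "__main__":
    for v in ("1.8", "1.9", "1.10", "1.11", "1.12"):
        print(v, "stable" if is_stable(v) else "unstable")
```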
OpenNMS 14
• Reflects 10+ years of development
• More frequent releases
• Best Release to Date:
– Numerous bug fixes big and small
– Graphical Improvements
Maps! We have Maps! (demo)
Ops Panel and Wall Board
Alarms and Automations
• Alarms exist to
– Reduce similar events
– Perform correlation
• Automations consist of
– Trigger (optional)
– Action
– Event (optional)
• Automations operate mainly on alarms but can access
the whole database
• Used for correlation
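The "reduce similar events" idea can be sketched in a few lines of Python. This is an illustration only, not OpenNMS code: it assumes each event carries a reduction key and a timestamp, and the `Alarm` class, its `counter` field and the `reduce_events` function are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    reduction_key: str
    first_event_time: float
    last_event_time: float
    counter: int = 1  # how many events were folded into this alarm


def reduce_events(events):
    """Collapse (reduction_key, timestamp) pairs into one Alarm per key."""
    alarms = {}
    for key, ts in events:
        alarm = alarms.get(key)
        if alarm is None:
            # First event with this key opens a new alarm.
            alarms[key] = Alarm(key, ts, ts)
        else:
            # Later events with the same key only update the existing alarm.
            alarm.counter += 1
            alarm.last_event_time = max(alarm.last_event_time, ts)
    return alarms
```

An automation would then run periodically over the resulting alarms rather than over the raw event stream.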
Automations Example: Did a script run?
[Diagram: External Script → Script Started / Script Finished / Script Error → Port 5817 → OpenNMS]
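A wrapper around the external script could deliver its events along these lines. This is a hedged Python sketch: port 5817 comes from the slide, but the XML element layout is an assumption modeled on what the bundled send-event.pl script is understood to emit and should be checked against the event schema of the installed version. The function names, the `script-wrapper` source string and the default server address are hypothetical.

```python
import socket
from xml.sax.saxutils import escape


def build_event_xml(uei, host, parms):
    """Build one event document (element layout is an assumption)."""
    parm_xml = "".join(
        "<parm><parmName>{}</parmName>"
        '<value type="string" encoding="text">{}</value></parm>'.format(
            escape(name), escape(value)
        )
        for name, value in parms.items()
    )
    return (
        "<log><events><event>"
        "<uei>{}</uei><source>{}</source><host>{}</host>"
        "<parms>{}</parms>"
        "</event></events></log>"
    ).format(escape(uei), "script-wrapper", escape(host), parm_xml)


def send_event(payload, server="127.0.0.1", port=5817, timeout=5.0):
    """Open a TCP connection to the event port and write the document."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(payload.encode("utf-8"))


# Example call at the end of a script run (needs a reachable server):
#   doc = build_event_xml("uei.opennms.org/scripts/scriptFinished",
#                         "node1.example.org", {"name": "backup"})
#   send_event(doc)
```

The `name` parameter matters here because the triggers in the following steps extract it from `eventparms` to tell scripts apart.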
Step 1: Create a Tracker Alarm
• Trigger: See if a ScriptFinished alarm exists without a
tracker alarm
• Action: NOP
• Event: Create a new event that will generate the
tracker alarm
<automation name="generateTracker" interval="30000" active="true"
            trigger-name="selectFinishedScriptsNoTracker"
            action-name="doNothingAction"
            action-event="createTrackerAlarm" />
selectFinishedScriptsNoTracker
<trigger name="selectFinishedScriptsNoTracker" operator=">=" row-count="1" >
  <statement>
    SELECT alarmid AS _alarmid,
           nodeid AS _nodeid,
           eventuei AS _eventuei,
           lasteventtime AS _ts,
           substring(eventparms from '.*name=(\w+).*') AS _parmname
      FROM alarms
     WHERE eventuei='uei.opennms.org/scripts/scriptFinished'
       AND substring(reductionkey from 'uei.opennms.org/scripts/scriptFinished:(.*)')
           NOT IN (SELECT substring(reductionkey from '.*scriptTracker:.*:(.*)')
                     FROM alarms
                    WHERE eventuei='uei.opennms.org/scripts/scriptTracker')
  </statement>
</trigger>
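The `substring(eventparms from '.*name=(\w+).*')` expression pulls the script name out of the alarm's parameter string by returning the first parenthesized group of the pattern. The same extraction can be tried out in Python. This is a sketch only: the sample `eventparms` strings in the test values are assumptions about how parameters are stored, and `extract_parm_name` is a hypothetical helper.

```python
import re

# Same pattern as in the trigger. DOTALL lets "." match newlines, which is
# how PostgreSQL regular expressions behave by default.
PARM_NAME = re.compile(r".*name=(\w+).*", re.DOTALL)


def extract_parm_name(eventparms):
    """Return the captured script name, or None when there is no match."""
    match = PARM_NAME.match(eventparms)
    return match.group(1) if match else None
```

Because `\w+` stops at the first non-word character, a script name containing a dash or a dot would be cut short, which is worth keeping in mind when naming scripts.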
doNothingAction
<action name="doNothingAction" >
  <statement>
    UPDATE node SET nodeid = -1 WHERE nodeid = -1
  </statement>
</action>
createTrackerAlarm
<action-event name="createTrackerAlarm" for-each-result="true" >
  <assignment type="field" name="uei" value="uei.opennms.org/scripts/scriptTracker" />
  <assignment type="field" name="nodeid" value="${_nodeid}" />
  <assignment type="parameter" name="name" value="${_parmname}" />
  <assignment type="parameter" name="alarmId" value="${_alarmid}" />
  <assignment type="parameter" name="alarmEventUei" value="${_eventUei}" />
</action-event>
Step 2: Update the Tracker Alarm
• Trigger: See if a new ScriptFinished has arrived
• Action: Update the last event time for the Tracker
• Event: none
<automation name="updateTracker" interval="30000" active="true"
            trigger-name="selectFinishedScripts"
            action-name="updateTrackerAlarms" />
selectFinishedScripts
<trigger name="selectFinishedScripts" operator=">=" row-count="1" >
  <statement>
    SELECT alarmid AS _alarmid,
           nodeid AS _nodeid,
           lasteventtime AS _lasteventtime,
           substring(eventparms from '.*name=(\w+).*') AS _parmname,
           now() AS _ts
      FROM alarms
     WHERE eventuei='uei.opennms.org/scripts/scriptFinished'
  </statement>
</trigger>
updateTrackerAlarms
<action name="updateTrackerAlarms" >
  <statement>
    UPDATE alarms
       SET firstautomationtime = COALESCE(firstautomationtime, ${_ts}),
           lastautomationtime = ${_ts},
           lasteventtime = ${_lasteventtime}
     WHERE eventuei = 'uei.opennms.org/scripts/scriptTracker'
       AND substring(reductionkey from 'uei.opennms.org/scripts/scriptTracker:(.*)')
           = ${_nodeid}||':'||${_parmname}
  </statement>
</action>
Step 3: See if the Tracker Alarm is Updated
• Trigger: Select all of the open Tracker alarms and
check if they have been updated.
• Action: NOP
• Event: Create a new event that will generate the
“script did not run” alarm
<automation name="generateScriptDidNotRun" interval="30000" active="true"
            trigger-name="selectTrackerAlarms"
            action-name="doNothingAction"
            action-event="createScriptDidNotRunAlarm" />
selectTrackerAlarms
<trigger name="selectTrackerAlarms" operator=">=" row-count="1" >
  <statement>
    SELECT alarmid AS _alarmid,
           nodeid AS _nodeid,
           eventuei AS _eventuei,
           lasteventtime AS _lasteventtime,
           now() AS _ts,
           substring(eventparms from '.*name=(\w+).*') AS _parmname
      FROM alarms
     WHERE eventuei='uei.opennms.org/scripts/scriptTracker'
       AND lasteventtime &lt; now() - interval '3 minutes'
       AND substring(reductionkey from 'uei.opennms.org/scripts/scriptTracker:(.*)')
           NOT IN (SELECT substring(reductionkey from '.*scriptNotRunning:(.*)')
                     FROM alarms
                    WHERE eventuei='uei.opennms.org/scripts/scriptNotRunning')
  </statement>
</trigger>
doNothingAction
<action name="doNothingAction" >
  <statement>
    UPDATE node SET nodeid = -1 WHERE nodeid = -1
  </statement>
</action>
createScriptDidNotRunAlarm
<action-event name="createScriptDidNotRunAlarm" for-each-result="true" >
  <assignment type="field" name="uei" value="uei.opennms.org/scripts/scriptNotRunning" />
  <assignment type="field" name="nodeid" value="${_nodeid}" />
  <assignment type="parameter" name="name" value="${_parmname}" />
  <assignment type="parameter" name="alarmId" value="${_alarmid}" />
  <assignment type="parameter" name="alarmEventUei" value="${_eventUei}" />
</action-event>
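To see how the three automations interact, the logic can be simulated in memory. This is a simplified Python sketch, not how vacuumd executes: it assumes alarms are identified by node ID plus script name, runs all three steps in one pass instead of on separate 30-second timers, and uses hypothetical names (`Alarm`, `run_automations`). The three-minute threshold mirrors `interval '3 minutes'` in selectTrackerAlarms.

```python
from dataclasses import dataclass

FINISHED = "uei.opennms.org/scripts/scriptFinished"
TRACKER = "uei.opennms.org/scripts/scriptTracker"
NOT_RUNNING = "uei.opennms.org/scripts/scriptNotRunning"
MAX_AGE_SECONDS = 3 * 60


@dataclass
class Alarm:
    uei: str
    node_id: int
    script: str
    last_event_time: float


def run_automations(alarms, now):
    """Apply the three steps once to a list of Alarm objects, in place."""

    def keys(uei):
        return {(a.node_id, a.script) for a in alarms if a.uei == uei}

    finished = [a for a in alarms if a.uei == FINISHED]

    # Step 1: create a tracker for each finished alarm that lacks one.
    tracked = keys(TRACKER)
    for f in finished:
        if (f.node_id, f.script) not in tracked:
            alarms.append(Alarm(TRACKER, f.node_id, f.script, f.last_event_time))
            tracked.add((f.node_id, f.script))

    # Step 2: copy the finished alarm's last event time onto its tracker.
    for f in finished:
        for a in alarms:
            if a.uei == TRACKER and (a.node_id, a.script) == (f.node_id, f.script):
                a.last_event_time = f.last_event_time

    # Step 3: raise "script did not run" for stale trackers, once per script.
    reported = keys(NOT_RUNNING)
    for t in [a for a in alarms if a.uei == TRACKER]:
        stale = t.last_event_time < now - MAX_AGE_SECONDS
        if stale and (t.node_id, t.script) not in reported:
            alarms.append(Alarm(NOT_RUNNING, t.node_id, t.script, now))
            reported.add((t.node_id, t.script))
    return alarms
```

Calling `run_automations` repeatedly with an advancing `now` and an unchanged scriptFinished alarm eventually yields exactly one scriptNotRunning alarm per script. Clearing that alarm once the script runs again is outside this sketch.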
Looking To the Future: IoT
• Newts: New Time Series Database
• Minion/Dominion:
https://github.com/OpenNMS/smnnepo