OUCE2013-RBEM-PT


OpenNMS - Rule Based Event Management Presentation

Transcript of OUCE2013-RBEM-PT

Page 1: OUCE2013-RBEM-PT

Rule Based Event Management Presentation 2013-03-12 / Version: 1.1.2

created with

[email protected]

Page 2: OUCE2013-RBEM-PT

OUCE 2013 2

Agenda

➢ OpenNMS Event Management
- Drools Platform Overview
- Drools Rule Basics
- Activation of Drools
- More Information

Page 3: OUCE2013-RBEM-PT

OpenNMS Event Management

Event Flow (Best Practice)

(Diagram: processing stages connected by arrows labeled Event or Alarm.)

Validation / Mapping
- Perform validation
- Is event defined?

Duplicate Detection
- Generate Alarms
- Find Duplicates

Correlation / Automation
- Update Alarms
- Clear Alarms
- Run Automations

Trouble Ticketing
- Open Incident Ticket
- Point to the Root-Cause

Notification
- Notify that action is required

Escalation
- Incident is not solved in the estimated Time

Page 4: OUCE2013-RBEM-PT

OpenNMS Event Management

Event Flow (Best Practice): same diagram as on page 3, with part of the flow marked "Focus of this presentation".

Page 5: OUCE2013-RBEM-PT

What is an Event?

- Indication of something that has happened
- Two types of events:
  - Internal: Management of OpenNMS
  - External: Management of IT-Operations
- Events are defined in eventconf.xml
- Events can have different properties
- Events are received on port 5817 / REST
- Client scripts: 'send-event.pl' / 'send-trap.pl'

Page 6: OUCE2013-RBEM-PT

Event Anatomy

  <event>
      <uei/>
      <event-label/>
      <descr/>
      <logmsg/>
      <severity/>
      <alarm-data/>
      <operinstruct/>
      <mouseovertext/>
      <autoaction/>
  </event>

Callouts:
- <uei/>: Unique Universal Event Identifier: uei.opennms.org/webserver/down
- <severity/>: 7x Severities
- <alarm-data/>: Defines the Event - Alarm Relation
- <autoaction/>: Runs Action

Page 7: OUCE2013-RBEM-PT

Node Discovery

$> send-event.pl -i 127.0.0.1 -s Discovery \
   -p "nodelabel sun" \
   uei.opennms.org/internal/capsd/addInterface \
   -x 4

Callouts:
- -i 127.0.0.1: Interface
- -p "nodelabel sun": Nodelabel: sun
- uei.opennms.org/internal/capsd/addInterface: Event Type: Internal Event
- -x 4: Severity: Warning

Page 8: OUCE2013-RBEM-PT

Event View

Node ID #5

Page 9: OUCE2013-RBEM-PT

What is an Alarm?

- Alarms are generated by Events
- Reduction-key identifies the Event as an Alarm
- Alarms are processed by Alarmd
- Three types of Alarms:
  - "1": a problem that has a possible resolution
  - "2": a resolution event
  - "3": an event that has no possible resolution
- Events are linked to Alarms
- Cleared Alarms are removed automatically from the DB

Page 10: OUCE2013-RBEM-PT

Alarm Anatomy

 <event>
      <uei/>
      ...
      <alarm-data
          reduction-key="%uei%:%nodeid%:%parm[#2]%"
          alarm-type="2"
          clear-key="uei.opennms.org/dbserver/down:%nodeid%:%parm[#2]%"
          auto-clean="true">
          <update-field field-name="severity"/>
          <update-field field-name="logmsg"
              update-on-reduction="false"/>
      </alarm-data>
 </event>

Callouts:
- reduction-key: Duplication Detection Rule
- clear-key: Clearing Rule
- auto-clean: Grooming Rule
- update-field: Change Rule

Page 11: OUCE2013-RBEM-PT

Alarm Rules

Reduction Key

It's used for event duplication detection (repeat count)

The granularity determines the amount of reduction

Clear Key

Used in case of a resolution (alarm-type=2)

Resolution alarm's clear-key has to match the problem alarm's reduction-key
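The two key rules above can be sketched in plain Java. This is a toy model, not OpenNMS code: the class AlarmReductionSketch, its Alarm type and the methods onProblem and onResolution are names invented for this illustration, and the real Alarmd does considerably more (for example, it also stores the resolution as an alarm of its own).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of alarm reduction and clearing; this is not OpenNMS code. */
public class AlarmReductionSketch {

    /** One alarm row, identified by its reduction key. */
    static final class Alarm {
        final String reductionKey;
        int counter;      // number of events folded into this alarm
        boolean cleared;  // set once a matching resolution arrives

        Alarm(String reductionKey) {
            this.reductionKey = reductionKey;
        }
    }

    private final Map<String, Alarm> alarms = new LinkedHashMap<>();

    /** Problem event: an existing key bumps the counter, a new key creates an alarm. */
    Alarm onProblem(String reductionKey) {
        Alarm alarm = alarms.computeIfAbsent(reductionKey, Alarm::new);
        alarm.counter++;
        alarm.cleared = false;
        return alarm;
    }

    /** Resolution event: clears the alarm whose reduction key equals the clear key. */
    boolean onResolution(String clearKey) {
        Alarm alarm = alarms.get(clearKey);
        if (alarm == null) {
            return false; // no problem alarm with that key, nothing to clear
        }
        alarm.cleared = true;
        return true;
    }

    int alarmCount() {
        return alarms.size();
    }

    public static void main(String[] args) {
        AlarmReductionSketch sketch = new AlarmReductionSketch();
        String key = "uei.opennms.org/webserver/down:5:Http";
        sketch.onProblem(key);
        Alarm alarm = sketch.onProblem(key); // duplicate event, same alarm
        System.out.println(sketch.alarmCount() + " alarm, counter " + alarm.counter);
        System.out.println("cleared: " + sketch.onResolution(key));
    }
}
```

A coarser reduction key (for example without the service part) would fold more events into one alarm, which is what "granularity determines the amount of reduction" refers to.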

Page 12: OUCE2013-RBEM-PT

Alarm Rules

Update Field
- Allows updates to a few specific alarm fields (e.g. severity)
- lastEventId, lastEventTime, logMsg, and eventParms are updated by default

Auto Cleaning
- All previous events matching the reduction key of the current event will be removed from the DB

Page 13: OUCE2013-RBEM-PT

Alarm View

- Acknowledge Alarms
- Clear Alarms
- Escalate Alarms

Page 14: OUCE2013-RBEM-PT

Event Processing

"There is a default automation that deletes unacknowledged alarms whose severity is 'Cleared', so if you want an alarm to go away, it should be cleared and unacknowledged"

Jeff Gehlbach / OpenNMS

Page 15: OUCE2013-RBEM-PT

Event Categories

Problem Event
A problem event precedes another event in a sequence. It is most likely the cause of a symptom event that arrives later, assuming they are related to the same component.

Resolution Event
A resolution event indicates the return to a typical state, thus canceling a problem state. When a resolution event is received, processing should clear any related problem events.

Symptom Event
A symptom event is a symptom of some other problem. The cause of a problem might not always be known when a symptom event is received.

Page 16: OUCE2013-RBEM-PT

Simple Event Sequence

Problem Event: uei.opennms.org/webserver/down

Resolution Event: uei.opennms.org/webserver/up

Page 17: OUCE2013-RBEM-PT

Xzample.events.xml

Create Xzample.events.xml

$> touch $OPENNMS_HOME/etc/events/Xzample.events.xml 

Page 18: OUCE2013-RBEM-PT

Add a Problem Event to Xzample.events.xml

<event>
   <uei>uei.opennms.org/webserver/down</uei>
   <event-label>Webserver Down</event-label>
   <descr>
     &lt;p&gt;%parm[subSource]% -
     Status 503 Service Unavailable&lt;/p&gt;
   </descr>
   <logmsg dest='logndisplay'>
     &lt;p&gt;SubSource: %parm[subSource]% is down -
     Source: %parm[source]%&lt;/p&gt;
   </logmsg>
   <severity>Warning</severity>
   <alarm-data reduction-key="%uei%:%nodeid%:%service%"
     alarm-type="1"
     auto-clean="false" />
</event>

Page 19: OUCE2013-RBEM-PT

Add a Resolution Event to Xzample.events.xml

<event>
  <uei>uei.opennms.org/webserver/up</uei>
  <event-label>Webserver Up</event-label>
  <descr>
    &lt;p&gt;%parm[subSource]% - Status 200 OK&lt;/p&gt;
  </descr>
  <logmsg dest='logndisplay'>
    &lt;p&gt;SubSource: %parm[subSource]% is up -
    Source: %parm[source]%&lt;/p&gt;
  </logmsg>
  <severity>Normal</severity>
  <alarm-data reduction-key="%uei%:%nodeid%:%service%"
    alarm-type="2"
    clear-key="uei.opennms.org/webserver/down:%nodeid%:%service%"
    auto-clean="true"/>
</event>

Page 20: OUCE2013-RBEM-PT

Problem & Resolution Event

Problem Event: reduction-key="%uei%:%nodeid%:%service%"

Resolution Event: clear-key="uei.opennms.org/webserver/down:%nodeid%:%service%"

clear-key == reduction-key
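To see why the two keys end up equal, the %...% placeholders can be expanded by hand. The sketch below does this in plain Java; the class KeyExpansionSketch and its expand helper are hypothetical names for this illustration, and the real OpenNMS expansion supports more tokens (such as %parm[...]%) than this simple string replacement.

```java
import java.util.Map;

/** Hand-rolled expansion of %name% placeholders; not the OpenNMS implementation. */
public class KeyExpansionSketch {

    /** Replaces each %name% token with its value; unknown tokens are left untouched. */
    static String expand(String template, Map<String, String> fields) {
        String result = template;
        for (Map.Entry<String, String> field : fields.entrySet()) {
            result = result.replace("%" + field.getKey() + "%", field.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Fields of the problem event (node 5, service Http)
        Map<String, String> down = Map.of(
                "uei", "uei.opennms.org/webserver/down",
                "nodeid", "5",
                "service", "Http");
        // Fields of the resolution event for the same node and service
        Map<String, String> up = Map.of(
                "uei", "uei.opennms.org/webserver/up",
                "nodeid", "5",
                "service", "Http");

        String reductionKey = expand("%uei%:%nodeid%:%service%", down);
        String clearKey = expand("uei.opennms.org/webserver/down:%nodeid%:%service%", up);

        System.out.println(reductionKey);
        System.out.println("keys match: " + clearKey.equals(reductionKey));
    }
}
```

The clear-key hard-codes the UEI of the problem event because %uei% would expand to the resolution event's own UEI.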

Page 21: OUCE2013-RBEM-PT

Extend eventconf.xml

Add the following line to the end of the <event-file> entries in $OPENNMS_HOME/etc/eventconf.xml. It has to sit inside the root element, before the closing </events> tag:

<event-file>events/Xzample.events.xml</event-file>

Reload eventconf.xml

$> $OPENNMS_HOME/bin/send-event.pl \
   uei.opennms.org/internal/eventsConfigChange

Page 22: OUCE2013-RBEM-PT

Send a Problem Event

$> ./send-event.pl -n 5 -s Http \
   -p "source send-event.pl" \
   -p "subSource webserver1" \
   uei.opennms.org/webserver/down -x 7

Callouts:
- -n 5: Node ID #5 (in this case)
- -s Http: Service Http
- -p "source send-event.pl": parm[#1]
- -p "subSource webserver1": parm[#2]
- -x 7: Severity: Critical

Page 23: OUCE2013-RBEM-PT

Event View

Page 24: OUCE2013-RBEM-PT

Alarm View

Page 25: OUCE2013-RBEM-PT

Send a Resolution Event

$> ./send-event.pl -n 5 -s Http \
   -p "source send-event.pl" \
   -p "subSource webserver1" \
   uei.opennms.org/webserver/up -x 3

Callouts:
- -n 5: Node ID #5 (in this case)
- -s Http: Service Http
- -p "source send-event.pl": parm[#1]
- -p "subSource webserver1": parm[#2]
- -x 3: Severity: Normal

Page 26: OUCE2013-RBEM-PT

Event View

Page 27: OUCE2013-RBEM-PT

Alarm View

Page 28: OUCE2013-RBEM-PT

Complex Event Sequence

Workshop Preview - Drools

- Problem Event: Webserver2 Down
- Problem Event: Webserver1 Down
- Resolution Event: Webserver1 Up
- Resolution Event: Webserver2 Up
- Symptom Event: CarDirectDown
- Resolution Event: CarDirectUp

Page 29: OUCE2013-RBEM-PT

Agenda

- OpenNMS Event Management
➢ Drools Platform Overview
- Drools Rule Basics
- Activation of Drools
- More Information

Page 30: OUCE2013-RBEM-PT

Drools Platform Overview

Business Logic Integration Platform

- Expert
- Fusion
- jBPM 5
- Planner
- Guvnor
- UberFire

source: http://de.slideshare.net/mariofusco/introducing-drools

Page 31: OUCE2013-RBEM-PT

Expert & Fusion

Expert
- Basic rule engine, the core of the business logic integration platform
- Operates on a set of data (facts)

Fusion
- Can define relationships between facts over time
- Supports: CEP/ESP, sliding windows, temporal operators

Page 32: OUCE2013-RBEM-PT

jBPM5 & Planner

jBPM5
- Flexible and lightweight Business Process Management (BPM) tool
- Can be integrated with almost all the other modules
- Authoring tool: jBPM5 BPMN2 Eclipse editor

Planner
- Used to optimize automated planning problems
- Combines search algorithm with the core of the rule engine

Page 33: OUCE2013-RBEM-PT

Guvnor & UberFire

Guvnor
- Repository for Drools Knowledge Bases
- Web-based GUI
- Version management

UberFire (New Project)
- UberFire is an Eclipse-like workbench (web based), built on GWT, Errai and CDI

Page 34: OUCE2013-RBEM-PT

Agenda

- OpenNMS Event Management
- Drools Platform Overview
➢ Drools Rule Basics
- Activation of Drools
- More Information

Page 35: OUCE2013-RBEM-PT

Drools Rule Basics

Business Logic Integration Platform

- Expert
- Fusion
- jBPM 5
- Planner
- Guvnor
- UberFire

source: http://de.slideshare.net/mariofusco/introducing-drools

Page 36: OUCE2013-RBEM-PT

Rule Engine

Diagram components:
- Production Memory (rules)
- Working Memory (facts)
- Inference Engine (ReteOO/Leaps), containing the Pattern Matcher and the Agenda

Example:
- Rule File: NodeParentRules.drl
- Rule: "Webserver Down"
- Rule is triggered by facts - event(s): "uei.opennms.org/webserver/down"

source: http://docs.jboss.org/drools/release/5.5.0.Final/drools-expert-docs/html_single/index.html#d0e128
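How the parts of the rule engine fit together can be sketched in a few lines of plain Java. This is a deliberately naive model: ToyRuleEngine, Rule and Activation are hypothetical names, facts are plain UEI strings, and the matcher scans every rule against every fact instead of using a Rete network as Drools does.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Naive forward-chaining sketch; Drools uses a Rete network instead of a full scan. */
public class ToyRuleEngine {

    /** A rule: name, salience, condition (LHS) and consequence (RHS). */
    static final class Rule {
        final String name;
        final int salience;
        final Predicate<String> when;
        final Consumer<String> then;

        Rule(String name, int salience, Predicate<String> when, Consumer<String> then) {
            this.name = name;
            this.salience = salience;
            this.when = when;
            this.then = then;
        }
    }

    /** An activation pairs a rule with the fact that matched it. */
    static final class Activation {
        final Rule rule;
        final String fact;

        Activation(Rule rule, String fact) {
            this.rule = rule;
            this.fact = fact;
        }
    }

    final List<Rule> productionMemory = new ArrayList<>(); // the rules
    final List<String> workingMemory = new ArrayList<>();  // the facts (here: UEIs)

    /** Pattern matcher: builds the agenda, highest salience first. */
    List<Activation> match() {
        List<Activation> agenda = new ArrayList<>();
        for (Rule rule : productionMemory) {
            for (String fact : workingMemory) {
                if (rule.when.test(fact)) {
                    agenda.add(new Activation(rule, fact));
                }
            }
        }
        agenda.sort(Comparator.comparingInt((Activation a) -> a.rule.salience).reversed());
        return agenda;
    }

    /** Fires every activation once and returns the fired rule names in order. */
    List<String> fireAllRules() {
        List<String> fired = new ArrayList<>();
        for (Activation activation : match()) {
            activation.rule.then.accept(activation.fact);
            fired.add(activation.rule.name);
        }
        return fired;
    }

    public static void main(String[] args) {
        ToyRuleEngine engine = new ToyRuleEngine();
        engine.productionMemory.add(new Rule("Webserver Down", 100,
                uei -> uei.equals("uei.opennms.org/webserver/down"),
                uei -> System.out.println("matched " + uei)));
        engine.workingMemory.add("uei.opennms.org/webserver/down");
        System.out.println(engine.fireAllRules());
    }
}
```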

Page 37: OUCE2013-RBEM-PT

Rule File

- Text file with a .drl extension
- Package declaration must be the first element
- DRL file contains: multiple rules, queries & functions, imports, globals and attributes
- Rules can be spread across multiple rule files

Page 38: OUCE2013-RBEM-PT

Rule File Anatomy

 package package-name
 imports
 globals
 functions
 queries
 rules

Page 39: OUCE2013-RBEM-PT

Rule Anatomy

rule "<name>"
    <attribute> <value>
when
    <LHS>
then
    <RHS>
end

- Quotes on rule names are optional if the rule name has no spaces
- Attributes: salience (priority) <int>, agenda-group <string>, no-loop <boolean>, auto-focus <boolean>, duration <long>, ...
- LHS = CONDITION: pattern matching against objects in the Working Memory
- RHS = CONSEQUENCE: code executed when a match is found

inspired by: http://de.slideshare.net/mariofusco/introducing-drools

Page 40: OUCE2013-RBEM-PT

What is a condition/pattern?

Event( uei == "uei.opennms.org/webserver/down" )

- Pattern: the whole expression
- Object Type: Event
- Field Constraint: uei == "uei.opennms.org/webserver/down"
- Field Name: uei
- Restriction: == "uei.opennms.org/webserver/down"

inspired by: http://de.slideshare.net/mariofusco/introducing-drools

Page 41: OUCE2013-RBEM-PT

Rule Facts

// Java
public class Event {
  private String uei;
  private int severity;
  private int priority;
  private String message;
  // getter and setter here
}

// DRL
declare Event
  uei : String
  severity : int
  priority : int
  message : String
end

// Rule
rule "Change Priority"
  no-loop
when
  $event : Event( severity == 7 );
then
  modify( $event ) { priority = 1 };
end

inspired by: http://de.slideshare.net/mariofusco/introducing-drools

Page 42: OUCE2013-RBEM-PT

Rule Consequence: Methods for Handling Facts

insert()
For inserting new facts into the session:
  insert( new Event() );

modify()
For updating existing facts in the session:
  modify( $event ) { priority = 1 };

retract()
For removing existing facts from the session:
  retract( $event );

Page 43: OUCE2013-RBEM-PT

Rule Syntax: Types

('…' must be replaced with 'uei.opennms.org')

String:
  Event( uei == ".../webserver/down" )

Regular expression:
  Event( uei matches ".*nodeDown" )

Date:
  Event( createTime > "13-Mar-2013" ) //  "dd-mmm-yyyy"

Boolean:
  Event( isAcknowledged == true )

Enum:
  Event( type == Event.Type.CRITICAL )

Page 44: OUCE2013-RBEM-PT

Rule Syntax: Conditions / Pattern

And:
  Event( uei == ".../webserver/down",
         severity < 6 )

Or:
  Event( uei == ".../webserver/down" ||
         severity < 6 )

Not:
  not Event( uei == ".../webserver/down" )

Exists:
  exists Event( uei matches "[A-Z][a-z]+" )
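For readers more at home in Java than in DRL, the four condition forms can be read as predicates over the facts in working memory. The sketch below is only an analogy: ConditionSketch and its methods are invented for this illustration, and it assumes that the DRL matches operator behaves like Java's String.matches, i.e. the regular expression must match the whole field value.

```java
import java.util.List;

/** Plain-Java reading of the DRL condition forms; an analogy, not Drools code. */
public class ConditionSketch {

    static final class Event {
        final String uei;
        final int severity;

        Event(String uei, int severity) {
            this.uei = uei;
            this.severity = severity;
        }
    }

    static final String DOWN = "uei.opennms.org/webserver/down";

    /** And: a comma between constraints means both must hold for the same event. */
    static boolean andPattern(Event e) {
        return e.uei.equals(DOWN) && e.severity < 6;
    }

    /** Or: one of the two constraints is enough. */
    static boolean orPattern(Event e) {
        return e.uei.equals(DOWN) || e.severity < 6;
    }

    /** not Event(...): true when no fact in working memory matches. */
    static boolean notPattern(List<Event> workingMemory) {
        return workingMemory.stream().noneMatch(e -> e.uei.equals(DOWN));
    }

    /** exists Event( uei matches regex ): true when at least one fact matches. */
    static boolean existsPattern(List<Event> workingMemory, String regex) {
        return workingMemory.stream().anyMatch(e -> e.uei.matches(regex));
    }

    public static void main(String[] args) {
        List<Event> workingMemory = List.of(
                new Event(DOWN, 7),
                new Event("uei.opennms.org/nodes/nodeDown", 4));
        System.out.println(andPattern(workingMemory.get(0)));
        System.out.println(orPattern(workingMemory.get(0)));
        System.out.println(notPattern(workingMemory));
        System.out.println(existsPattern(workingMemory, ".*nodeDown"));
    }
}
```

Under the whole-value assumption, a pattern such as "[A-Z][a-z]+" only matches values like "Critical" and never a full UEI, so it is best read as a syntax illustration.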

Page 45: OUCE2013-RBEM-PT

Rule Syntax: Variables / Comments

Variables

  Rules can declare variables as follows:

  $event : Event( $uei : uei )

Comments

  #  single line comment
  // single line comment
  /* multi line
     comment */

Page 46: OUCE2013-RBEM-PT

Rule Syntax: Package / Imports

Package

  Group of related rules

  package org.opennms.netmgt.correlation.drools;

Imports

  Have the same purpose as standard Java imports

  import org.opennms.netmgt.xml.event.Event;
  import org.opennms.netmgt.model.events.EventBuilder;

Page 47: OUCE2013-RBEM-PT

Rule Syntax: Functions / Dialect

Functions

  Can be used in conditions and consequences

  function void println(Object msg) {
     System.out.println(new Date() + " : " + msg);
  }

Dialect

- Specifies the syntax used in any code expression
- Default value is Java
- Drools supports one more dialect called mvel
- Dialect can be set on package or rule level

Page 48: OUCE2013-RBEM-PT

Timers & Calendars

rule 'Change Severity'
  timer 5m30s
when
  $evt : Event( acknowledged == false )
then
  modify( $evt ) { acknowledged = true };
end

When the event is 'unack.', and has been 'unack.' for 5m30s, then 'ack' it.

rule 'Maintenance Mode'
  calendars "weekend"
when
  $evt : Event()
then
  retract($evt);
end

Drop events on weekends

rule 'Send AutoTask Event'
  timer (cron: 0/5 * * * * *)
when
  Event()
then
  sendEvent();
end

Send Event every five seconds

inspired by: http://de.slideshare.net/mariofusco/introducing-drools
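The intent of the first timer rule (acknowledge an event once it has stayed unacknowledged for 5m30s) can be spelled out in plain Java. The sketch is an illustration under assumptions: TimerSketch, parse and dueForAutoAck are hypothetical helpers, and parse only handles simple strings such as "5m30s"; it is not the parser Drools uses.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrates the "unacknowledged for 5m30s" condition; not Drools code. */
public class TimerSketch {

    private static final Pattern PART = Pattern.compile("(\\d+)([dhms])");

    /** Parses strings such as "5m30s" or "1h" into a Duration (hypothetical helper). */
    static Duration parse(String text) {
        Duration total = Duration.ZERO;
        Matcher matcher = PART.matcher(text);
        int end = 0;
        while (matcher.find()) {
            if (matcher.start() != end) {
                throw new IllegalArgumentException("unexpected input: " + text);
            }
            long amount = Long.parseLong(matcher.group(1));
            switch (matcher.group(2)) {
                case "d": total = total.plusDays(amount); break;
                case "h": total = total.plusHours(amount); break;
                case "m": total = total.plusMinutes(amount); break;
                default:  total = total.plusSeconds(amount); break;
            }
            end = matcher.end();
        }
        if (end == 0 || end != text.length()) {
            throw new IllegalArgumentException("unexpected input: " + text);
        }
        return total;
    }

    /** True once an unacknowledged event has waited at least the given delay. */
    static boolean dueForAutoAck(Instant created, Instant now,
                                 boolean acknowledged, Duration delay) {
        return !acknowledged && !now.isBefore(created.plus(delay));
    }

    public static void main(String[] args) {
        Duration delay = parse("5m30s");
        Instant created = Instant.parse("2013-03-12T10:00:00Z");
        System.out.println(delay.getSeconds() + " seconds");
        System.out.println(dueForAutoAck(created, created.plusSeconds(400), false, delay));
    }
}
```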

Page 49: OUCE2013-RBEM-PT

Agenda

- OpenNMS Event Management
- Drools Platform Overview
- Drools Rule Basics
➢ Activation of Drools
- More Information

Page 50: OUCE2013-RBEM-PT

Activation of Drools

- Drools is part of the correlation engine
- Correlation engine is not activated by default
- Drools needs to be configured
- OpenNMS comes with:
  - example Configurations
  - example Rules
- OpenNMS uses Drools version: 5.1.1

Page 51: OUCE2013-RBEM-PT

Configuration

(1) Go to the opennms example directory

$> cd $OPENNMS_HOME/etc/examples

(2) Copy all example configurations and rules

$> cp correlation-engine.xml \
    drools-engine.xml \
    LocationMonitorRules.drl \
    NodeParentRules.drl \
    nodeParentRules-context.xml \
    $OPENNMS_HOME/etc

Page 52: OUCE2013-RBEM-PT

Configuration

(3) Edit service-configuration.xml

Uncomment the service named "OpenNMS:Name=Correlator" in $OPENNMS_HOME/etc/service-configuration.xml

(4) Restart opennms

$> sudo service opennms restart

(5) Check 'spring.log'

$> grep 'drools-correlation-engine' \
   $OPENNMS_HOME/logs/daemon/spring.log

2013-02-02 09:23:05,854 INFO  [Main] XmlBeanDefinitionReader: Loading XML bean definitions from URL [jar:file:/usr/share/opennms/lib/drools-correlation-engine-1.10.2.jar!/META-INF/opennms/correlation-engine.xml]

Page 53: OUCE2013-RBEM-PT

Event Relationship Example

- Problem Event: nodeDown
- Symptom Event: …/webserver/down
- Resolution Event: nodeUp
- Resolution Event: …/webserver/up

The webserver events are created by Drools.

Page 54: OUCE2013-RBEM-PT

Extend NodeParentRules.drl

(1) Add function sendEvent to NodeParentRules.drl

 function void sendEvent(DroolsCorrelationEngine engine,
         String uei, Long nodeId, String svcName,
         String subSource, String severity) {
     // severity is a label such as "Critical" or "Normal"
     EventBuilder bldr = new EventBuilder(uei, "Drools")
         .setNodeid(nodeId.intValue())
         .setService(svcName)
         .setSeverity(severity)
         .addParam("source", "Drools")
         .addParam("subSource", subSource);
     engine.sendEvent(bldr.getEvent());
 }

Page 55: OUCE2013-RBEM-PT

Extend NodeParentRules.drl

(2) Add 'Webserver Down' rule to NodeParentRules.drl

 rule "Webserver Down"
     salience 766
 when
     Event( uei matches ".*nodeDown",
            descr matches ".*503",
            $nodeid: nodeid )
 then
     sendEvent(engine,
         "uei.opennms.org/webserver/down",
         $nodeid, "Http", "webserver1",
         "Critical");
     println("---> Webserver Down Event");
 end

Page 56: OUCE2013-RBEM-PT

Extend NodeParentRules.drl

(3) Add 'Webserver Up' rule to NodeParentRules.drl

 rule "Webserver Up"
     salience 777
 when
     Event( uei matches ".*nodeUp",
            descr matches ".*200",
            $nodeid: nodeid )
 then
     sendEvent(engine,
         "uei.opennms.org/webserver/up",
         $nodeid, "Http", "webserver1",
         "Normal");
     println("---> Webserver Up Event");
 end

Page 57: OUCE2013-RBEM-PT

Restart & Send Event

(4) Restart OpenNMS

$> sudo service opennms restart

(5) Send problem event

$> ./send-event.pl -n 5 -d "Status: 503" \
     uei.opennms.org/nodes/nodeDown -x 4

Page 58: OUCE2013-RBEM-PT

Event View

Page 59: OUCE2013-RBEM-PT

Alarm View

Page 60: OUCE2013-RBEM-PT

Send Resolution Event

(6) Send 'nodeUp' Event

$> ./send-event.pl -n 5 -d "Status: 200" \
     uei.opennms.org/nodes/nodeUp -x 3
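What the "Webserver Down" and "Webserver Up" rules do with the two events sent in this walkthrough can be traced in plain Java. This is a stand-in for illustration only: WebserverCorrelationSketch and derive are hypothetical names, and the sketch assumes that the DRL matches operator requires the regular expression to match the whole value, as Java's String.matches does.

```java
import java.util.Optional;

/** Plain-Java stand-in for the two correlation rules; not the DRL itself. */
public class WebserverCorrelationSketch {

    /** Returns the UEI of the derived webserver event, if one of the rules matches. */
    static Optional<String> derive(String uei, String descr) {
        // Rule "Webserver Down": nodeDown event whose description ends in 503
        if (uei.matches(".*nodeDown") && descr.matches(".*503")) {
            return Optional.of("uei.opennms.org/webserver/down");
        }
        // Rule "Webserver Up": nodeUp event whose description ends in 200
        if (uei.matches(".*nodeUp") && descr.matches(".*200")) {
            return Optional.of("uei.opennms.org/webserver/up");
        }
        return Optional.empty(); // neither rule fires
    }

    public static void main(String[] args) {
        System.out.println(derive("uei.opennms.org/nodes/nodeDown", "Status: 503"));
        System.out.println(derive("uei.opennms.org/nodes/nodeUp", "Status: 200"));
        System.out.println(derive("uei.opennms.org/nodes/nodeDown", "Status: 200"));
    }
}
```

Both constraints have to hold for the same event, so a nodeDown event with the description "Status: 200" produces no webserver event.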

Page 61: OUCE2013-RBEM-PT

Event View

Page 62: OUCE2013-RBEM-PT

Alarm View

Page 63: OUCE2013-RBEM-PT

Log File

$OPENNMS_HOME/logs/daemon/output.log

Page 64: OUCE2013-RBEM-PT

Agenda

- OpenNMS Event Management
- Drools Platform Overview
- Drools Rule Basics
- Activation of Drools
➢ More Information

Page 65: OUCE2013-RBEM-PT

More Information

Presentation
- http://de.slideshare.net/mschneider73

OpenNMS
- http://www.opennms.org/wiki/Events#Events_and_Alarms
- http://www.opennms.org/wiki/Drools_Correlation_Engine

Drools
- http://docs.jboss.org/drools/release/5.2.0.Final/drools-expert-docs/html/ch05.html
- http://www.jboss.org/drools/presentations
- http://mvel.codehaus.org

Page 66: OUCE2013-RBEM-PT

03/14/13 OUCE 2013 66

Comments & Questions

Thank you for your attention

Contact details:

[email protected]

www.rapideca.org