Remote Network Monitoring: Alarms and Filters
Chu-Sing Yang
Department of Electrical Engineering
National Cheng Kung University
Outline
• Introduction
• Alarm Group
• Filter Group
• Packet capture Group
• Event Group
• Practical Issues
• Summary
Introduction
• The last chapter described some of the RMON groups
  • statistics, history, host, hostTopN, matrix, tokenRing
  • Concerned with the collection of traffic statistics
• This chapter looks at the remainder of the RMON groups
  • alarm group
  • filter group
  • packet capture group
  • event group
The RMON MIB
• Statistics
  • Maintains low-level utilization and error statistics for each subnetwork monitored by the agent
  • Provides information about the load on a subnetwork and the overall health of the subnetwork
• History
  • Defines sampling functions for one or more of the interfaces of the monitor
  • Records periodic statistical samples from information available in the statistics group
• Alarm
  • Allows the manager to set a sampling interval and alarm threshold for any counter or integer recorded by the RMON probe
• Host
  • Used to gather statistics about specific hosts on the LAN
  • Contains counters for various types of traffic to and from hosts attached to the subnetwork
• hostTopN
  • Used to maintain statistics about the set of hosts on one subnetwork that top a list based on some parameter
  • Contains sorted host statistics that report on the hosts that top a list based on some parameter in the host table
The RMON MIB (cont.)
• Matrix
  • Used to record information about the traffic between pairs of hosts on a subnetwork
  • Shows error and utilization information in matrix form
• Filter
  • Allows the monitor to observe packets that match a filter
• Packet capture
  • Governs how data is sent to a management console
• Event
  • Gives a table of all events generated by the RMON probe
• tokenRing
  • Maintains statistics and configuration information for token ring subnetworks
The RMON MIB (cont.)
• Dependencies
  • The alarm group requires the implementation of the event group
  • The hostTopN group requires the implementation of the host group
  • The packet capture group requires the implementation of the filter group
• Groups concerned with the collection of traffic statistics for one or more subnetworks
  • statistics, history, host, hostTopN, matrix, and tokenRing
• Groups concerned with various alarm conditions and packet filtering
  • alarm, filter, packet capture, event
Alarm Group
• Used to define a set of thresholds for network performance
  • If a threshold is crossed, an alarm is generated and sent to the central console
  • Example: an alarm is generated if there are more than 500 CRC errors (the threshold) in any 5-minute period (the sampling interval)
• Consists of a single table, alarmTable
  • Each entry in the table specifies
    • a particular variable to be monitored
    • a sampling interval
    • threshold parameters
  • When the current sampling interval is completed, the new value for the sampled variable is stored and the old value is lost
[Figure: RMON alarm group object tree]
alarm (rmon 3)
  alarmTable (1)
    alarmEntry (1)
      alarmIndex (1)
      alarmInterval (2)
      alarmVariable (3)
      alarmSampleType (4)
      alarmValue (5)
      alarmStartupAlarm (6)
      alarmRisingThreshold (7)
      alarmFallingThreshold (8)
      alarmRisingEventIndex (9)
      alarmFallingEventIndex (10)
      alarmOwner (11)
      alarmStatus (12)
alarmTable
• alarmIndex
  • An integer that uniquely identifies a row in the alarmTable
  • Each row specifies a sample at a particular interval for a particular object in the monitor’s MIB
• alarmInterval
  • The interval over which the data are sampled and compared with the rising and falling thresholds
• alarmVariable
  • The object identifier of the particular variable in the RMON MIB to be sampled
  • The permitted object types are INTEGER, Counter, Gauge, and TimeTicks
• alarmSampleType
  • The method of calculating the value to be compared to the thresholds
    • absoluteValue(1): the value of the selected variable is compared directly with the thresholds
    • deltaValue(2): the value of the selected variable at the last sample is subtracted from the current value, and the difference is compared to the thresholds
alarmTable
• alarmValue
  • Gives the value of the statistic during the last sampling period
• alarmStartupAlarm
  • Dictates whether an alarm will be generated if the first sample after the row becomes valid is greater than or equal to the risingThreshold, less than or equal to the fallingThreshold, or both, respectively
  • Has the values risingAlarm(1), fallingAlarm(2), and risingOrFallingAlarm(3)
• alarmRisingThreshold
  • Is the rising threshold for the sampled statistic
• alarmFallingThreshold
  • Is the falling threshold for the sampled statistic
• alarmRisingEventIndex
  • The index of the eventEntry that is used when the rising threshold is crossed
• alarmFallingEventIndex
  • The index of the eventEntry that is used when the falling threshold is crossed
Operation of Alarm Scheme
• The monitor or a manager defines a new alarm in the alarmTable
  • Creates a new row with a combination of variable, sampling interval, and threshold parameters
• Two thresholds are provided
  • Rising threshold
    • Is crossed if the current sampled value is greater than or equal to the rising threshold and the value at the last sampling interval was less than the threshold
  • Falling threshold
    • Is crossed if the current sampled value is less than or equal to the falling threshold and the value at the last sampling interval was greater than the threshold
• Two types of values are calculated for alarms
  • absoluteValue
    • Is simply the value of an object at the time of sampling
  • deltaValue
    • Is the difference in values for the object over two successive sampling periods
    • A rate of change
Rising-alarm Event Rules
1. (a) If the first sampled value obtained after the row becomes valid is less than the rising threshold, then a rising-alarm event is generated the first time that the sample value becomes greater than or equal to the rising threshold
   (b) If the first sampled value obtained after the row becomes valid is greater than or equal to the rising threshold, and if the value of alarmStartupAlarm is risingAlarm(1) or risingOrFallingAlarm(3), then a rising-alarm event is generated
   (c) If the first sampled value obtained after the row becomes valid is greater than or equal to the rising threshold, and if the value of alarmStartupAlarm is fallingAlarm(2), then a rising-alarm event is generated the first time that the sample value again becomes greater than or equal to the rising threshold after having fallen below the rising threshold
2. After a rising-alarm event is generated, another such event will not be generated until the sampled value has fallen below the rising threshold, reached the falling threshold, and then subsequently reached the rising threshold again
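The rising-alarm rules above can be sketched as a small state machine. This is a minimal illustration, not an SNMP agent: the class name AlarmRow, its fields, and the sample() method are hypothetical, and the falling-alarm behavior is assumed to mirror the rising-alarm rules.

```python
from dataclasses import dataclass

RISING_ALARM, FALLING_ALARM, RISING_OR_FALLING_ALARM = 1, 2, 3


@dataclass
class AlarmRow:
    """Hypothetical model of one alarmTable row (illustration only)."""
    rising_threshold: int
    falling_threshold: int
    startup_alarm: int = RISING_OR_FALLING_ALARM
    rising_armed: bool = True
    falling_armed: bool = True
    hold_rising: bool = False    # rule 1(c): wait until the value drops below the rising threshold
    hold_falling: bool = False   # assumed mirror image of rule 1(c)
    first_sample: bool = True

    def sample(self, value):
        """Process one sampled value; return 'rising', 'falling', or None."""
        above = value >= self.rising_threshold
        below = value <= self.falling_threshold
        if self.first_sample:
            self.first_sample = False
            # Startup cases where alarmStartupAlarm suppresses the first event.
            if above and self.startup_alarm == FALLING_ALARM:
                self.hold_rising = True
            if below and self.startup_alarm == RISING_ALARM:
                self.hold_falling = True
        if not above:
            self.hold_rising = False
        if not below:
            self.hold_falling = False
        if above and self.rising_armed and not self.hold_rising:
            # Rule 2: disarm until the falling threshold has been reached.
            self.rising_armed, self.falling_armed = False, True
            return "rising"
        if below and self.falling_armed and not self.hold_falling:
            self.falling_armed, self.rising_armed = False, True
            return "falling"
        return None


row = AlarmRow(rising_threshold=20, falling_threshold=10)
events = [row.sample(v) for v in [5, 21, 19, 22, 9, 25]]
```

In this run the dip to 19 followed by 22 produces no second rising event, because the value never reached the falling threshold in between; that is the hysteresis effect.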
Generation of Alarm Events
Hysteresis Mechanism
• The mechanism by which small fluctuations are prevented from causing alarms
Double Sampling Rule
• The rising threshold is 20
• 10-second intervals
  • No alarm is triggered

  Time (t)          0    10    20
  Observed value    0    19    32
  Delta value       0    19    13

• 5-second intervals
  • At t = 15, the sum of the last two delta values (9 + 11 = 20) meets the rising-alarm threshold and triggers a rising-alarm event

  Time (t)          0     5    10    15    20
  Observed value    0    10    19    30    32
  Delta value       0    10     9    11     2
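The two tables can be checked with a short calculation. This sketch assumes that double sampling means sampling twice per interval and comparing the sum of the two most recent deltas with the threshold; the function names are illustrative only.

```python
def deltas(observed):
    """Differences between successive samples of a counter."""
    return [later - earlier for earlier, later in zip(observed, observed[1:])]


def double_sampled_deltas(observed):
    """Sum of the two most recent half-interval deltas at each sample time."""
    d = deltas(observed)
    return [first + second for first, second in zip(d, d[1:])]


RISING_THRESHOLD = 20
every_10s = [0, 19, 32]           # observed at t = 0, 10, 20
every_5s = [0, 10, 19, 30, 32]    # observed at t = 0, 5, 10, 15, 20

single = deltas(every_10s)                 # one delta per 10-second interval
double = double_sampled_deltas(every_5s)   # sliding 10-second window, at t = 10, 15, 20

alarm_times = [t for t, v in zip([10, 15, 20], double) if v >= RISING_THRESHOLD]
```

With single sampling, neither 10-second delta reaches 20, so the burst between t = 5 and t = 15 goes unnoticed; the sliding window catches it at t = 15.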
Filter Group
• Provides a means by which a manager can instruct a monitor to observe selected packets on a particular interface
  • Data filter
    • Allows the monitor to screen observed packets on the basis of a bit pattern that a portion of the packet matches
  • Status filter
    • Allows the monitor to screen observed packets on the basis of their status
    • Valid, CRC error, …
• These filters can be combined using logical AND and OR operations to form a complex test to be applied to incoming packets
Filter Group
• Channel
  • The stream of packets that pass the test
  • Maintains a count of such packets
  • Can be configured to generate an event when a packet passes through the channel and the channel is in an enabled state
• The packets passing through a channel can be captured if the mechanism is defined in the capture group
Filter Logic
• At the lowest level of the filter logic, a single data filter or status filter defines characteristics of a packet
  • input = the incoming portion of a packet to be filtered
  • filterPktData = the bit pattern to be tested for
  • filterPktDataMask = the relevant bits to be tested for
  • filterPktDataNotMask = indication of whether to test for a match or a mismatch
• An initial step: test the input against a bit pattern for a match
  • For example, to screen for packets with a specific source address
      if ((input ^ filterPktData) == 0)
          filterResult = match;
Filter Logic
• A mismatch test can be used to screen for all packets that did not have the server as a source
      if ((input ^ filterPktData) != 0)
          filterResult = mismatch;
• Don’t-care bits are not relevant to the filter
      if (((input ^ filterPktData) & filterPktDataMask) == 0)
          filterResult = match_on_relevant_bits;
      else
          filterResult = mismatch_on_relevant_bits;
Filter Logic
• Test for an input that matches in certain relevant bit positions and mismatches in others
      relevant_bits_different = (input ^ filterPktData) & filterPktDataMask;
  • Test for a match
      if ((relevant_bits_different & ~filterPktDataNotMask) == 0)
          filterResult = successful_match;
  • Test for a mismatch
      if (((relevant_bits_different & filterPktDataNotMask) != 0) || (filterPktDataNotMask == 0))
          filterResult = successful_mismatch;
• filterPktDataNotMask
  • 0-bits: the positions where an exact match is required between the relevant bits of input and filterPktData
  • 1-bits: the positions where a mismatch is required
An Example of Filter Test
• Accept all Ethernet packets that have a destination address of 0xA5 and that do not have a source address of 0xBB
• The test can be implemented as follows:
  • filterPktDataOffset = 0
  • filterPktData = 0x0000000000A50000000000BB
  • filterPktDataMask = 0xFFFFFFFFFFFFFFFFFFFFFFFF
  • filterPktDataNotMask = 0x000000000000FFFFFFFFFFFF
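The data-filter test described on the preceding slides, applied to this example, might look as follows. The function name data_filter_passes and the sample packets are hypothetical, and treating a packet that is too short as a failed match is an assumption of this sketch.

```python
def data_filter_passes(packet: bytes, offset: int, data: bytes,
                       mask: bytes, not_mask: bytes) -> bool:
    """Apply one data filter (match plus mismatch test) to a packet."""
    window = packet[offset:offset + len(data)]
    if len(window) < len(data):
        return False  # assumption: packets too short for the pattern fail
    inp = int.from_bytes(window, "big")
    pattern = int.from_bytes(data, "big")
    m = int.from_bytes(mask, "big")
    nm = int.from_bytes(not_mask, "big")
    relevant_bits_different = (inp ^ pattern) & m
    # Positions with a 0 in the not-mask must match exactly.
    match_ok = (relevant_bits_different & ~nm) == 0
    # Positions with a 1 in the not-mask must differ in at least one bit.
    mismatch_ok = (relevant_bits_different & nm) != 0 or nm == 0
    return match_ok and mismatch_ok


# Parameters from the example: destination 0xA5, source anything but 0xBB.
PKT_DATA = bytes.fromhex("0000000000A50000000000BB")
PKT_MASK = bytes.fromhex("FF" * 12)
PKT_NOT_MASK = bytes.fromhex("000000000000FFFFFFFFFFFF")


def example_filter(packet: bytes) -> bool:
    return data_filter_passes(packet, 0, PKT_DATA, PKT_MASK, PKT_NOT_MASK)


dst_a5 = bytes.fromhex("0000000000A5")
dst_a6 = bytes.fromhex("0000000000A6")
src_bb = bytes.fromhex("0000000000BB")
src_cc = bytes.fromhex("0000000000CC")
payload = b"\x08\x00data"

accepted = example_filter(dst_a5 + src_cc + payload)
wrong_source = example_filter(dst_a5 + src_bb + payload)
wrong_destination = example_filter(dst_a6 + src_cc + payload)
```

Only the first packet passes: the second has the excluded source address, and the third has the wrong destination.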
Channel Definition
• A channel is defined by a set of filters
• For each observed packet, and for each channel
  • The packet is passed through each of the filters
  • Whether the packet is accepted for a channel depends on how the filter results are combined, which is determined by the value of an object associated with the channel
• channelAcceptType
  • INTEGER { acceptMatched(1), acceptFailed(2) }
  • 1: a packet is accepted for this channel if it passes both the packet data and packet status matches of at least one of the associated filters
  • 2: a packet is accepted only if it fails either the packet data match or the packet status match of every associated filter
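The two channelAcceptType rules reduce to "at least one filter passed" and its negation. A minimal sketch, assuming each filter result is given as a hypothetical (data_ok, status_ok) pair:

```python
ACCEPT_MATCHED, ACCEPT_FAILED = 1, 2


def channel_accepts(filter_results, accept_type):
    """Decide whether a packet is accepted for a channel.

    filter_results: one (data_ok, status_ok) pair per associated filter.
    """
    any_filter_passed = any(data_ok and status_ok
                            for data_ok, status_ok in filter_results)
    if accept_type == ACCEPT_MATCHED:
        return any_filter_passed
    if accept_type == ACCEPT_FAILED:
        # Accepted only if every filter failed its data or status match.
        return not any_filter_passed
    raise ValueError("unknown channelAcceptType")


results = [(True, False), (True, True)]   # second filter passes both tests
matched = channel_accepts(results, ACCEPT_MATCHED)
failed = channel_accepts(results, ACCEPT_FAILED)
```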
[Figure: channel accept logic for acceptMatched and acceptFailed (NOT), showing when a packet is accepted. An event is generated if two conditions are met.]
Filter Group
• Consists of two control tables
  • filterTable
  • channelTable
    • Each row defines a unique channel
    • Associated with that channel are one or more rows in the filterTable, which define the associated filters
filterTable
• filterIndex
  • An integer that uniquely identifies a row in the filterTable
  • Each such row defines one data filter and one status filter that are to be applied to every packet received on an interface
• filterChannelIndex
  • The channel of which this filter is a part
• filterPktDataOffset
  • Offset from the beginning of each packet where a match of packet data will be attempted
• filterPktData
  • The data that is to be matched with the input packet
filterTable
• filterPktDataMask
  • The mask that is applied to the match process
• filterPktDataNotMask
  • The inversion mask that is applied to the match process
• filterPktStatus
  • The status that is to be matched with the input packet
• filterPktStatusMask
  • The mask that is applied to the status match process
• filterPktStatusNotMask
  • The inversion mask that is applied to the status match process
channelTable
• channelIndex
  • An integer that uniquely identifies one row in the channelTable
  • Each row defines one channel
• channelIfIndex
  • Identifies the monitor interface (subnetwork) to which the associated filters are applied to allow data into this channel
  • The value of this object instance equals the value of ifIndex, in the interfaces group of MIB-II, that corresponds to this interface
• channelAcceptType
  • Controls the action of the filters associated with this channel
  • acceptMatched(1)
    • Packets are accepted to this channel if they pass both the packet data and packet status matches of at least one of the associated filters
  • acceptFailed(2)
    • Packets are accepted to this channel only if they fail either the packet data match or the packet status match of every associated filter
channelTable
• channelDataControl
  • on(1)
    • The data, status, and events will flow through this channel
  • off(2)
    • The data, status, and events will not flow through this channel
• channelTurnOnEventIndex
  • Identifies the event that is configured to turn the associated channelDataControl from off to on when the event is generated
  • The value of this object identifies an object indexed by eventIndex in the event group
  • If no such event exists, then no association exists
  • If no event is intended, this object has the value 0
• channelTurnOffEventIndex
  • Identifies the event that is configured to turn the associated channelDataControl from on to off when the event is generated
  • The value of this object identifies an object indexed by eventIndex in the event group
  • If no such event exists, then no association exists
  • If no event is intended, this object has the value 0
channelTable
• channelEventIndex
  • Identifies the event that is configured to be generated when the associated channelDataControl is on and a packet is matched
  • The value of this object identifies an object indexed by eventIndex in the event group
  • If no such event exists, then no association exists
  • If no event is intended, this object has the value 0
• channelEventStatus
  • The event status of this channel
  • If the channel is configured to generate events when packets are matched, then the value of this object has the following interpretation
    • eventReady(1)
      • A single event will be generated for a packet match, after which this object is set to eventFired(2)
    • eventFired(2)
      • No events are generated
      • This allows the management station to respond to the notification of an event and then reenable the object
    • eventAlwaysReady(3)
      • Every packet match generates an event
• channelMatches
  • A counter that records the number of packet matches
  • This counter is updated even when channelDataControl is set to off
• channelDescription
  • Gives a text description of the channel
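The interplay of channelDataControl, channelEventStatus, and channelMatches can be sketched as follows. The Channel class and its packet_matched() method are hypothetical names used only for illustration.

```python
EVENT_READY, EVENT_FIRED, EVENT_ALWAYS_READY = 1, 2, 3
ON, OFF = 1, 2


class Channel:
    """Hypothetical model of the event-related objects of one channel."""

    def __init__(self, event_status=EVENT_READY, data_control=ON):
        self.event_status = event_status
        self.data_control = data_control
        self.matches = 0

    def packet_matched(self) -> bool:
        """Record one packet match; return True if an event is generated."""
        self.matches += 1  # counted even when channelDataControl is off
        if self.data_control != ON:
            return False
        if self.event_status == EVENT_READY:
            self.event_status = EVENT_FIRED  # one event, then wait for re-enable
            return True
        return self.event_status == EVENT_ALWAYS_READY


channel = Channel()
fired = [channel.packet_matched() for _ in range(3)]
channel.event_status = EVENT_READY        # management station re-enables events
fired_after_reset = channel.packet_matched()
```

Here only the first of the three matches generates an event; the next event is generated after the management station sets the status back to eventReady(1).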
Packet Capture Group
• Used to set up a buffering scheme for capturing packets from one of the channels in the filter group
• Consists of two tables
  • bufferControlTable
    • Specifies the details of the buffering function
    • Each row defines one buffer that is used to capture and store packets from one channel
  • captureBufferTable
    • Buffers the data
bufferControlTable
• bufferControlIndex
  • An integer that uniquely identifies a row
  • The same integer is also used to identify corresponding rows in the captureBufferTable
• bufferControlChannelIndex
  • Identifies the channel that is the source of packets for this row
  • The value matches that of channelIndex for one row of channelTable
• bufferControlFullStatus
  • spaceAvailable(1)
    • The buffer has room to accept new packets
  • full(2)
    • Its meaning depends on the value of bufferControlFullAction
bufferControlTable
• bufferControlFullAction
  • lockWhenFull(1)
    • The buffer will accept no more packets after it becomes full
  • wrapWhenFull(2)
    • The buffer acts as a circular buffer after it becomes full, deleting enough of the oldest packets to make room for new ones as they arrive
• bufferControlCaptureSliceSize
  • Maximum number of octets of each packet, starting with the beginning of the packet, that will be saved in this capture buffer
  • 0: The buffer will save as many octets as possible
  • The default value is 100
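The difference between lockWhenFull and wrapWhenFull, together with the capture slice size, can be sketched with a small in-memory buffer. This is a simplified model: the CaptureBuffer class is hypothetical, only packet octets are counted against the buffer size, and the full status is set as soon as a packet does not fit.

```python
from collections import deque

LOCK_WHEN_FULL, WRAP_WHEN_FULL = 1, 2
SPACE_AVAILABLE, FULL = 1, 2


class CaptureBuffer:
    """Hypothetical model of one bufferControlTable row and its packets."""

    def __init__(self, max_octets, full_action, capture_slice_size=100):
        self.max_octets = max_octets
        self.full_action = full_action
        self.capture_slice_size = capture_slice_size
        self.full_status = SPACE_AVAILABLE
        self.packets = deque()
        self.used = 0

    def capture(self, packet: bytes) -> bool:
        """Store one packet slice; return False if the packet was dropped."""
        if self.capture_slice_size == 0:
            stored = packet                      # save as many octets as possible
        else:
            stored = packet[:self.capture_slice_size]
        if len(stored) > self.max_octets:
            return False                         # could never fit in this buffer
        if self.full_action == LOCK_WHEN_FULL and self.full_status == FULL:
            return False                         # locked: accept no more packets
        while self.used + len(stored) > self.max_octets:
            self.full_status = FULL
            if self.full_action == LOCK_WHEN_FULL:
                return False
            self.used -= len(self.packets.popleft())   # wrap: drop the oldest
        self.packets.append(stored)
        self.used += len(stored)
        return True


incoming = (b"aaaaaa", b"bbbbbb", b"cccccc")
locked = CaptureBuffer(max_octets=10, full_action=LOCK_WHEN_FULL, capture_slice_size=4)
locked_results = [locked.capture(p) for p in incoming]
wrapped = CaptureBuffer(max_octets=10, full_action=WRAP_WHEN_FULL, capture_slice_size=4)
wrapped_results = [wrapped.capture(p) for p in incoming]
```

With a 10-octet buffer and a 4-octet slice, the locked buffer keeps the first two slices and drops the third, while the wrapping buffer discards the oldest slice to make room for the newest.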
bufferControlTable
• bufferControlDownloadSliceSize
  • The maximum number of octets of each packet in this buffer that will be returned in a single SNMP retrieval of that packet
• bufferControlDownloadOffset
  • Gives the offset of the first octet of each packet in this buffer that will be returned in a single SNMP retrieval of that packet
• bufferControlMaxOctetsRequested
  • The requested buffer size in octets
  • A value of -1 requests that the buffer be as large as possible
• bufferControlMaxOctetsGranted
  • The granted buffer size in octets
  • Is the maximum number of octets that can be saved
• bufferControlCapturedPackets
  • Indicates the number of packets currently in this buffer
• bufferControlTurnOnTime
  • Gives the value of sysUpTime when this buffer was first turned on
captureBufferTable
• Is the data table, which contains one row for each packet captured
• Includes the following objects:
  • captureBufferControlIndex
    • The buffer with which this packet is associated
    • The buffer identified by a particular value of this index is the same buffer as that identified by the same value of bufferControlIndex
  • captureBufferIndex
    • An index that uniquely identifies this particular packet among all packets associated with the same buffer
    • Starts at 1 and increases by one as each new packet is captured
    • Serves as a sequence number for packets in one buffer
  • captureBufferPacketID
    • An index that describes the order of packets that are received on a particular interface
    • Serves as a sequence number for packets that are captured from one subnetwork, regardless of which buffer(s) they are stored in
captureBufferTable
• captureBufferPacketData
  • Gives the actual packet data stored for this row
• captureBufferPacketLength
  • The actual length of the packet as received
  • It may be that only a part of the packet is actually stored in this entry
• captureBufferPacketTime
  • Indicates the number of milliseconds that had passed from the time that the buffer was turned on to the time that this packet was captured
• captureBufferPacketStatus
  • Indicates the error status of this packet
• A related set of parameters dictates how much of a packet is stored in the buffer and how much is available for delivery to a management station in one SNMP Get or Get-Next request
  • CS = bufferControlCaptureSliceSize: the maximum number of octets of each packet that will be saved in this capture buffer
  • DS = bufferControlDownloadSliceSize: the maximum number of octets of each packet in this buffer that will be returned in an SNMP retrieval of that packet
  • DO = bufferControlDownloadOffset: the offset of the first octet of each packet in this buffer that will be returned in an SNMP retrieval of that packet
  • PL = captureBufferPacketLength: the actual length of the packet
  • PDL = length of captureBufferPacketData: the actual packet data stored for this row of captureBufferTable
• PDL = MIN[PL, CS]
  • The packet (if PL ≤ CS) or packet slice (if PL > CS) is stored as a single OCTET STRING in one row of captureBufferTable
  • This OCTET STRING may well be longer than will fit in a single SNMP message
• The parameters DO and DS provide a tool to retrieve the captured packet in pieces
  • If you set DO to 0 and DS to 100, then get captureBufferPacketData, you will get octets 0..MIN(actualStoredData - 1, 99)
  • If you then set DO to 100 and get captureBufferPacketData, you will get octets 100..MIN(actualStoredData - 1, 199); and so on
  • If the station reads “off the end of the packet,” it gets a zero-length string
• A management station would set DO to 0 and DS to 100 or so, then make a complete pass through the table, getting PL, the first 100 bytes of the packet, and maybe PacketStatus, and so forth
  • The station would then set DO to 100 and make another pass through to get more of each packet, and so on, until all of the captured data for those packets of interest had been retrieved
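The slicing arithmetic above can be sketched with plain byte strings standing in for SNMP retrievals. The function names are hypothetical; the sketch shows PDL = MIN[PL, CS] and retrieval in DS-sized pieces by stepping DO.

```python
def stored_data(packet: bytes, capture_slice_size: int) -> bytes:
    """Data kept in the buffer: PDL = MIN[PL, CS] (CS = 0 keeps all octets)."""
    if capture_slice_size == 0:
        return packet
    return packet[:capture_slice_size]


def read_slice(stored: bytes, download_offset: int, download_slice_size: int) -> bytes:
    """Octets DO .. DO + DS - 1 of the stored data; empty when off the end."""
    return stored[download_offset:download_offset + download_slice_size]


def download(stored: bytes, download_slice_size: int = 100) -> bytes:
    """Retrieve the stored data piece by piece, stepping DO by DS each pass."""
    pieces = []
    download_offset = 0
    while True:
        piece = read_slice(stored, download_offset, download_slice_size)
        if not piece:          # zero-length string: read off the end of the packet
            break
        pieces.append(piece)
        download_offset += download_slice_size
    return b"".join(pieces)


packet = bytes(range(256))                          # PL = 256
stored = stored_data(packet, capture_slice_size=250)   # PDL = MIN[256, 250]
first = read_slice(stored, 0, 100)
last = read_slice(stored, 200, 100)
past_end = read_slice(stored, 300, 100)
```

Unlike the pass-by-pass procedure on the slide, this sketch retrieves one packet completely before moving on; a management station would normally change DO once and then walk the whole table.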
Event Group
• Supports the definition of events
  • An event is triggered by a condition located elsewhere in the MIB
  • An event can trigger an action defined elsewhere in the MIB
  • An event may cause information to be logged in this group and may cause an SNMP trap message to be issued
  • An event can be used to trigger activity related to another group
    • For example, an event can trigger turning a channel on or off
• The event group consists of
  • Control table: eventTable
    • Contains events to be generated when certain conditions are met
  • Data table: logTable
    • If an event is to be logged, entries will be created in the associated logTable
eventTable
• eventIndex
  • An integer that uniquely identifies a row
  • The same integer is also used to identify corresponding rows in the logTable
• eventDescription
  • A textual description of this event
• eventType
  • none(1)
  • log(2)
    • An entry is made in the log table for each event
  • snmp-trap(3)
    • An SNMP trap is sent to one or more management stations for each event
  • log-and-trap(4)
• eventCommunity
  • Specifies the community of management stations to receive the trap if an SNMP trap is to be sent
• eventLastTimeSent
  • The value of sysUpTime at the time this event entry last generated an event
logTable
• logEventIndex
  • Identifies the event that generated this log entry
  • The value of this index refers to the same event as identified by the same value of eventIndex
• logIndex
  • An index that uniquely identifies this particular log entry among all entries associated with the same event type
  • This index starts at 1 and increases by one as each new log entry is generated
• logTime
  • Gives the value of sysUpTime when this log entry was created
• logDescription
  • An implementation-dependent description of the event that activated this log entry
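How eventType decides between logging and sending a trap can be sketched as below. The Event class, its fire() method, and the send_trap callback are hypothetical; in particular, updating the last-sent time for every event type is an assumption of this sketch.

```python
from dataclasses import dataclass, field

NONE, LOG, SNMP_TRAP, LOG_AND_TRAP = 1, 2, 3, 4


@dataclass
class Event:
    """Hypothetical model of one eventTable row and its logTable rows."""
    index: int
    description: str
    event_type: int
    last_time_sent: int = 0
    log: list = field(default_factory=list)   # rows of (logIndex, logTime, logDescription)

    def fire(self, sys_up_time: int, send_trap) -> None:
        """Handle one occurrence of this event."""
        self.last_time_sent = sys_up_time      # assumption: updated for every type
        if self.event_type in (LOG, LOG_AND_TRAP):
            # logIndex starts at 1 and grows by one per new log entry.
            self.log.append((len(self.log) + 1, sys_up_time, self.description))
        if self.event_type in (SNMP_TRAP, LOG_AND_TRAP):
            send_trap(self.index, self.description)


traps = []
event = Event(index=1, description="CRC errors rising", event_type=LOG_AND_TRAP)
event.fire(100, lambda index, text: traps.append((index, text)))
event.fire(250, lambda index, text: traps.append((index, text)))
```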
Event Group
Packet Capture
• The packet capture feature of RMON is useful if used intelligently
• When a manager detects a specific problem area in the internet, it may be possible to zero in on a few nodes or a particular protocol as the suspected site(s) of the trouble
  • With appropriate filtering, RMON can be used to gather some raw data for diagnosis
• Example: a broadcast storm problem
  • RMON can be used to capture packets to and from the suspect device, for analysis by the network manager at the management station
Packet Capture Overload
• RMON is so rich that there is a very real danger of overloading
  • The monitor
  • The internet between the monitor and the management station
  • The management station
• The manager can develop RMON requests with some precision
  • A complex filter allows the monitor to capture and report a limited amount of data, thus avoiding an undue burden on the network
  • However, complex filters consume quite a bit of processing power at the monitor
  • If too many filters are defined, the monitor may not be able to keep up
Packet Capture Overload
• A performance test of various RMON products was conducted by Syracuse University in 1995
• Used a 25-node network and configured a minimal number of RMON functions
  • One 30-second and one 30-minute History study
  • One Statistics
  • Matrix
  • HostTopN
• Set up a filter for the Ethernet source address and from that address transmitted a total of 602 packets, with a gap of 500 microseconds between packets
• An additional 7,000 packets per second were generated to give a utilization level of about 70%
• The experiment was run twice
  • Once with 7,000 broadcast packets providing the background load
  • Once with 7,000 unicast packets providing the background load
RMON Probe Performance
• Only two of the products captured all 602 packets in both tests
• Two more captured all packets in the unicast case
Network Inventory
• It is difficult for a management station to determine the identity of all of the agents
  • The SNMP design philosophy discourages agents from initiating communication strictly for the purpose of letting management stations know they exist
• RMON solves the problem
  • Provides a handy way to maintain an inventory of all devices on the network that are capable of sending or receiving packets of data
Hardware Platform
• The choice of platform depends on the size and complexity of the given subnetwork
  • Dedicated platform
    • Personal computer, workstation
    • For a high-traffic subnetwork, such as a backbone
  • Nondedicated platform
    • A bridge, switch, router, or management station
    • For those subnetworks that have relatively light traffic and don’t absolutely require 100% uptime
    • Example: a departmental LAN
Interoperability
• An RMON manager program can work with a variety of RMON probes
• Winterfold Datacomm found a number of interoperability problems
  • Differences in the way RMON managers manipulate the tables that define the tasks running in probes and agents can prevent some RMON managers from retrieving needed information
  • Those same differences also can prevent managers from getting a complete view of all the tasks assigned to a given RMON probe by other RMON managers
  • This leaves the manager unable to control the agent’s resources, which means it may not be able to create new tasks to run on the agent
  • Packet capture is unreliable in a multivendor environment
Summary
• In addition to providing a capability to collect traffic statistics from subnetworks, RMON provides features that enable the definition of events and alarms and that define packet stream filters and capturing logic
• Alarm
  • Allows the person at the management console to set a sampling interval and alarm threshold for any counter or integer recorded by the RMON probe
• Filter
  • Allows the monitor to observe packets that match a filter
  • The monitor may either capture all packets that pass the filter or simply record statistics based on such packets
• Packet capture
  • Governs how data are sent to a management console
• Event
  • A table of all events generated by the RMON probe