PDF Tnpm Troubleshooting


  • 8/10/2019 PDF Tnpm Troubleshooting


    IBM Tivoli Netcool Performance Manager 1.3.2
    Wireline Component
    Document Revision R2E2

    Troubleshooting Guide


    Note: Before using this information and the product it supports, read the information in Notices on page 177.

    Copyright IBM Corporation 2011, 2013. US Government Users Restricted Rights: Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.


    Contents

    Chapter 1. Troubleshooting Tivoli Netcool Performance Manager
      Troubleshooting a problem
      Troubleshooting checklist for Tivoli Netcool Performance Manager
      Known problems and solutions
      Troubleshooting tasks
        Real-time charts do not work as expected
        Error: ORA-00001: unique constraint (PV_ADMIN.PK_SEGM) violated
        MDE memory constraint
        Incomplete SNMPv3 metric collection
        After upgrading Tivoli Common Reporting, it is not possible to log into Tivoli Integrated Portal
        Collectors swapping from idle to running at startup
      Searching knowledge bases

    Chapter 2. Logs (Wireline Component)
      Overview
      Logs by component
        Installation log files
        DataChannel logs
        DataLoad logs
        DataMart logs
        DataView logs
        Database log
      Logs messages format
      Logging configuration and information utilities
        DataChannel logs configuration
        DataView logs configuration
        statGet utility
      Configuring trace and logging
        Default logging level
        Trace logging for DataView
        The configure command
      Troubleshooting
      Event IDs

    Chapter 3. Contacting IBM support
      Exchanging information with IBM
        Sending information to IBM Support

    Chapter 4. Introduction SNMP Inventory
      Overview
      Discovery
        Metrics and Properties
      Inventory Synchronization and Change Management
        Change Management for Elements
        Change Management for Sub-Elements
      Grouping Sub-Elements
      Where to Go From Here

    Chapter 5. SNMP inventory troubleshooting
      Overview
      Discovery Troubleshooting
        Discovery Does Not Start
        Discovery Starts But Issues Warning Messages
        Discovery Seems to Hang or Never Finishes
      Synchronization Troubleshooting
        Synchronization (Elements)
        Synchronization (Sub-elements)
      Grouping Troubleshooting
      Monitoring the Tivoli Netcool Performance Manager Log File
      Tivoli Netcool Performance Manager Log Messages
      Burned subelements
        Scenario 1 - Instance Shift Causes Disconnect
        Scenario 2 - Instance Shift Causes Burn
      Where to Go From Here

    Chapter 6. SNMP inventory management
      Regular monitoring
      Routine SNMP inventory management tasks
        Finding Elements and subelements about to reach their retry limit
        Finding Elements and Sub-elements That Have Been Retired
      Where to go from here

    Chapter 7. Messages
      DataChannel error messages
      DataLoad error messages
      DataView operational messages

    Notices


    Chapter 1. Troubleshooting Tivoli Netcool Performance Manager

    You can use this troubleshooting and support information to troubleshoot problems with Tivoli Netcool Performance Manager.

    This information assumes a working installation of Tivoli Netcool Performance Manager. For installation or upgrade problems, refer to the installation and upgrade information.

    Troubleshooting a problem

    Troubleshooting is a systematic approach to solving a problem. The goal of troubleshooting is to determine why something does not work as expected and how to resolve the problem.

    The first step in the troubleshooting process is to describe the problem completely. Problem descriptions help you and the IBM technical-support representative know where to start to find the cause of the problem. This step includes asking yourself basic questions:

    • What are the symptoms of the problem?
    • Where does the problem occur?
    • When does the problem occur?
    • Under which conditions does the problem occur?
    • Can the problem be reproduced?

    The answers to these questions typically lead to a good description of the problem, which can then lead you to a problem resolution.

    What are the symptoms of the problem?

    When starting to describe a problem, the most obvious question is "What is the problem?" This question might seem straightforward; however, you can break it down into several more-focused questions that create a more descriptive picture of the problem. These questions can include:

    • Who, or what, is reporting the problem?
    • What are the error codes and messages?
    • How does the system fail? For example, is it a loop, hang, crash, performance degradation, or incorrect result?

    Where does the problem occur?

    Determining where the problem originates is not always easy, but it is one of the most important steps in resolving a problem. Many layers of technology can exist between the reporting and failing components. Networks, disks, and drivers are only a few of the components to consider when you are investigating problems.

    The following questions help you to focus on where the problem occurs to isolate the problem layer:

    • Is the problem specific to one platform or operating system, or is it common across multiple platforms or operating systems?
    • Is the current environment and configuration supported?

    If one layer reports the problem, the problem does not necessarily originate in that layer. Part of identifying where a problem originates is understanding the environment in which it exists. Take some time to completely describe the problem environment, including the operating system and version, all corresponding software and versions, and hardware information. Confirm that you are running within an environment that is a supported configuration; many problems can be traced back to incompatible levels of software that are not intended to run together or have not been fully tested together.

    When does the problem occur?

    Develop a detailed timeline of events leading up to a failure, especially for those cases that are one-time occurrences. You can most easily develop a timeline by working backward: Start at the time an error was reported (as precisely as possible, even down to the millisecond), and work backward through the available logs and information. Typically, you need to look only as far as the first suspicious event that you find in a diagnostic log.

    To develop a detailed timeline of events, answer these questions:

    • Does the problem happen only at a certain time of day or night?
    • How often does the problem happen?
    • What sequence of events leads up to the time that the problem is reported?
    • Does the problem happen after an environment change, such as upgrading or installing software or hardware?

    Responding to these types of questions can give you a frame of reference in which to investigate the problem.

    Under which conditions does the problem occur?

    Knowing which systems and applications are running at the time that a problem occurs is an important part of troubleshooting. These questions about your environment can help you to identify the root cause of the problem:

    • Does the problem always occur when the same task is being performed?
    • Does a certain sequence of events need to occur for the problem to surface?
    • Do any other applications fail at the same time?

    Answering these types of questions can help you explain the environment in which the problem occurs and correlate any dependencies. Remember that just because multiple problems might have occurred around the same time, the problems are not necessarily related.

    Can the problem be reproduced?

    From a troubleshooting standpoint, the ideal problem is one that can be reproduced. Typically, when a problem can be reproduced you have a larger set of tools or procedures at your disposal to help you investigate. Consequently, problems that you can reproduce are often easier to debug and solve. However, problems that you can reproduce can have a disadvantage: If the problem is of significant business impact, you do not want it to recur. If possible, re-create the problem in a test or development environment, which typically offers you more flexibility and control during your investigation.

    • Can the problem be re-created on a test system?
    • Are multiple users or applications encountering the same type of problem?
    • Can the problem be re-created by running a single command, a set of commands, or a particular application?

    Searching knowledge bases on page 7
      You can often find solutions to problems by searching IBM knowledge bases. You can optimize your results by using available resources, support tools, and search methods.

    Troubleshooting checklist for Tivoli Netcool Performance Manager

    By answering a set of questions that are structured into a checklist, you can sometimes identify the cause of a problem and find a resolution to the problem on your own.

    Answering the following questions can help you to identify the source of a problem that is occurring with Tivoli Netcool Performance Manager:

    1. Is your issue a known problem?
    2. Is the configuration supported?
    3. What are you doing when the problem occurs?
       • Installing, upgrading, or migrating the product
       • Doing administration tasks
       • Doing authorization tasks
       • Networking
       • Using the product
    4. What, if any, error messages or error codes were issued?
    5. If the checklist does not guide you to a resolution, collect additional diagnostic data. This data is necessary for an IBM technical-support representative to effectively troubleshoot and assist you in resolving the problem.

    Known problems and solutions

    A list of known problems and their solutions is described here.

    For a list of known problems, visit the following Web site: Known Issues with Tivoli Netcool Performance Manager 1.3 - Wireline Component (http://www-01.ibm.com/support/docview.wss?uid=swg21428805)

    Troubleshooting tasks

    Some troubleshooting tasks in Tivoli Netcool Performance Manager are described here.

    Real-time charts do not work as expected

    Symptoms

    Sometimes when you restart the Tivoli Netcool Performance Manager Channel Name Server (CNS) or Channel Manager (CMGR) component, your real-time charts no longer work as expected. Real-time charts require a valid reference to the real-time subscriber object. A restart of these components can cause the Channel Name Server (CNS) to no longer provide a valid reference to the real-time subscriber object.


    Resolving the problem

    To restore the proper operation of your real-time charts in this situation:

    User response:

    1. On the DataChannel host, change to the $DCHOME/bin directory. For example:
         cd /opt/datachannel/bin
    2. Find the PID of the DataChannel CNS component:
         ./findvisual | grep CNS_visual
       You can see an output like the following:
         pvuser 653 648 0 Apr 14 ? 129:46 /opt/datachannel/bin/CNS_visual -nologo /opt/datachannel/bin/dc.im -a CNS -f /o
       Note: The PID is the first number after your login ID. For example, in the previous output, the PID is 653.
    3. Stop CNS by using the following command:
         kill -9 CNS_pid
    4. Find the PID of the DataChannel CMGR component by using the following command:
         ./findvisual | grep CMGR_visual
    5. Stop CMGR by using the following command:
         kill -9 CMGR_pid
    6. On the DataView host, stop the Tivoli Integrated Portal server by using the following command:
         stopServer server1 -user tip_user -password tip_password
    7. On the DataChannel host, change to the $DCHOME/bin directory. For example:
         cd /opt/datachannel/bin
    8. Restart CNS and CMGR by running the following commands in the following order:
         ./cnsw
         ./cmgrw
    9. On the DataView host, restart the Tivoli Integrated Portal server by using the following command:
         startServer server1 -user tip_user -password tip_password
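The PID lookup in steps 2 and 4 can be scripted. The following is a minimal sketch, assuming that findvisual prints ps-style lines in which the PID is the second whitespace-separated field, as in the sample output in step 2. The function name pid_of is illustrative and is not part of the product.

```shell
#!/bin/sh
# pid_of NAME: read ps-style lines on stdin and print the PID (second field)
# of every line that mentions NAME. Hypothetical helper, not a product command.
pid_of() {
    awk -v name="$1" 'index($0, name) { print $2 }'
}

# Sample line taken from the output shown in step 2.
sample='pvuser 653 648 0 Apr 14 ? 129:46 /opt/datachannel/bin/CNS_visual -nologo /opt/datachannel/bin/dc.im -a CNS -f /o'
printf '%s\n' "$sample" | pid_of CNS_visual   # prints 653
```

On a DataChannel host the function could be fed directly from the real command, for example ./findvisual | pid_of CNS_visual. Check the value that is printed before you pass it to kill -9.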

    Error: ORA-00001: unique constraint (PV_ADMIN.PK_SEGM) violated

    A unique constraint error is generated during inventory and grouping.

    Symptoms

    The following message is received in inventory and grouping output:
      -Error: ORA-00001: unique constraint (PV_ADMIN.PK_SEGM) violated
      -Reason: a static group membership link exists between a sub-element and a group

    Causes

    A user uses the resmgr command to insert a resource into a group and then runs inventory grouping. If inventory grouping tries to place the same resource into the group, a unique constraint error is generated.

    Resolving the problem

    Using the resmgr command, delete the grouping link.


    MDE memory constraint

    A temporary tablespace error is generated when running an MDE query.

    Symptoms

    The following message is received when you attempt to run an MDE query:
      GYMVD0002E: Unable to retrieve information from the data source

    or
      Mon Feb 1 08:53:51 2010
      ORA-1652: unable to extend temp segment by 64 in tablespace PV_LOIS_TE

    Causes

    You are constrained by the available space in PV_LOIS_TEMP. This is a TEMP tablespace that is shared across all MDE sessions.

    Resolving the problem

    This error can be addressed with the following suggestions:

    • Reduce the time period used in the MDE query. If the time range is halved, the space required should be halved.
    • Reduce the number of metrics requested. If the original request was for 5 metrics and you can reduce this by 1 metric, it should require approximately 20% less space.
    • If neither option is possible, you can increase the size of the PV_LOIS_TEMP tablespace.
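The two estimates in the suggestions above rest on the assumption that the temporary space an MDE query needs grows roughly in proportion to the time range and to the number of metrics. Under that assumption, the expected saving can be sketched as simple integer arithmetic. The function name pct_saved is illustrative only.

```shell
#!/bin/sh
# pct_saved BEFORE AFTER: percentage reduction when a quantity drops from
# BEFORE to AFTER, assuming space scales linearly with that quantity.
pct_saved() {
    [ "$1" -gt 0 ] || return 1          # avoid division by zero
    echo $(( ($1 - $2) * 100 / $1 ))
}

pct_saved 5 4     # 5 metrics reduced to 4; prints 20
pct_saved 24 12   # time range halved (units are arbitrary); prints 50
```

The result is only a planning estimate. Actual temporary tablespace use depends on the query and on the other MDE sessions that share the tablespace.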

    Incomplete SNMPv3 metric collection

    Some data loss can be observed from the Expected measures and Produced measures in the SNMP collector log message.

    Symptoms

    Some data loss can be observed from the Expected measures and Produced measures in the SNMP collector log message, especially when there is a sudden drop in Produced measures when compared with past hours of collection.

    For example:
      DL31066 I DL_PERF_SUMMARY Hour: subElmts: metrics:

    If the system resources cannot be increased, decrease the concurrency in the collector.

    Note: This decreases the overall performance of the collector.

    Concurrency is controlled by the GLOBAL.SNMP.MAXASYNC parameter, which is given the default value of 256. The default value can be changed by adding a Custom DataChannel parameter in the TNPM topology. This parameter can be added using the Topology Editor: GLOBAL.SNMP.MAXASYNC=<value>, where <value> is a number less than 256.

    After upgrading Tivoli Common Reporting, it is not possible to log into Tivoli Integrated Portal

    Symptoms

    After upgrading Tivoli Common Reporting, it is not possible to log into Tivoli Integrated Portal.

    Resolving the problem

    After an upgrade of Tivoli Common Reporting, if you are experiencing problems logging into Tivoli Integrated Portal, do the following:

    User response:

    1. Clear your browser cache for the Tivoli Integrated Portal server. For example, in Firefox (v3.6):
       a. Click Tools > Options > Privacy.
       b. Remove individual cookies.
       c. Search using your Tivoli Integrated Portal server name and remove those cookies.
    2. Restart your browser.
    3. Log in to Tivoli Integrated Portal.

    Collectors swapping from idle to running at startup

    Symptoms

    A collector, upon startup, might wait a period of time, then transition itself to the running state, only to be swapped to an idle state by the High Availability Manager (HAM).

    Causes

    At startup the Collector sits in the 'Idle' state, waiting to determine what it should do next (if anything). The HAM (High Availability Manager) should probe the Collector, discover its state, and instruct it to 'start', to 'stay in idle', or to 'load'. Start leads to running (active collection), load leads to ready (a sort of hot-spare mode), and staying in idle means keep waiting.

    If a network disconnect occurred between the collector and HAM, the Collector may have a configuration (channel & collector number) from its previous run state. The result would be that the collector waits a period of time, then transitions itself to the running state, only to be swapped to an idle state by the HAM. The symptom is a result of the timing of the HAM's probing and the Collector's idle timeout.

    Resolving the problem

    User response:

    If you consider this behavior a problem, update either the polling interval of the HAM or the idle timeout of the collector:

    • Increase the Collector's idle timeout by editing DataChannels > Global DataChannel Properties > Advanced Properties > IDLETIMEOUT, or
    • Decrease the HAM's probe interval by editing DataChannels > Administrative Components > High Availability Managers > > Properties > POLL_INTERVAL

    Searching knowledge bases

    You can often find solutions to problems by searching IBM knowledge bases. You can optimize your results by using available resources, support tools, and search methods.

    About this task

    You can find useful information by searching the information center for Tivoli Netcool Performance Manager, but sometimes you need to look beyond the information center to answer your questions or resolve problems.

    Procedure

    To search knowledge bases for information that you need, use one or more of the following approaches:

    • Search for content by using the IBM Support Assistant (ISA).
      ISA is a no-charge software serviceability workbench that helps you answer questions and resolve problems with IBM software products. You can find instructions for downloading and installing ISA on the ISA website (http://www.ibm.com/software/support/isa/).

    • Find the content that you need by using the IBM Support Portal.
      The IBM Support Portal is a unified, centralized view of all technical support tools and information for all IBM systems, software, and services. The IBM Support Portal lets you access the IBM electronic support portfolio from one place. You can tailor the pages to focus on the information and resources that you need for problem prevention and faster problem resolution. Familiarize yourself with the IBM Support Portal by viewing the demo videos (https://www.ibm.com/blogs/SPNA/entry/the_ibm_support_portal_videos) about this tool. These videos introduce you to the IBM Support Portal, explore troubleshooting and other resources, and demonstrate how you can tailor the page by moving, adding, and deleting portlets.

    • Search for content about Tivoli Netcool Performance Manager by using one of the following additional technical resources:
      - Tivoli Netcool Performance Manager technotes (http://www.ibm.com/search/csass/search?q=version%201.3&sn=spe&lang=en&filter=collection:stgsysx,dblue,ic,pubs,devrel1&prod=S153855L88022N79#q%253dtechnotes%2520and%2520apars%2526filter%253d%252bcollection%253astgsysx%252cdblue%252cic%252cpubs%252cdevrel1%2526prod%253dS153855L88022N79%2526sn%253dspe%2526lang%253den%2526sortby%253d%2526o%253d0)
      - Tivoli Netcool Performance Manager Support Website (http://www-947.ibm.com/support/entry/portal/Overview/Software/Tivoli/Tivoli_Netcool_Performance_Manager)
      - Tivoli support communities (forums and newsgroups) (http://www.ibm.com/software/sysmgmt/products/support/Tivoli_Communities.html)

    • Search for content by using the IBM masthead search. You can use the IBM masthead search by typing your search string into the Search field at the top of any ibm.com page.

    • Search for content by using any external search engine, such as Google, Yahoo, or Bing. If you use an external search engine, your results are more likely to include information that is outside the ibm.com domain. However, sometimes you can find useful problem-solving information about IBM products in newsgroups, forums, and blogs that are not on ibm.com.
      Tip: Include IBM and the name of the product in your search if you are looking for information about an IBM product.


    Chapter 2. Logs (Wireline Component)

    Tivoli Netcool Performance Manager has various logs that can be used to examine processing results and problems.

    Overview

    The following table provides a high-level description of Tivoli Netcool Performance Manager logs.

    Table 1. The Tivoli Netcool Performance Manager logs by component

    Component    Description
    DataChannel  DataChannel manages the primary Tivoli Netcool Performance Manager proviso.log log file. This log file collects data from each DataChannel component, and from the SNMP and BCOL collectors. DataChannel also creates the tnpmlog.log log file that contains all logs of interest to the user.
    DataLoad     Separate log files capture events about the daemon start-stop sequence, SNMP activity, and watchdog queries to determine the status of the daemon. Log data is also written to the primary DataChannel log file.
    DataView     Records data about web transactions and database calls.
    DataMart     A number of log files containing messages related to inventory, internal Tivoli Netcool Performance Manager communications, and database status.
    Database     The standard Oracle database log file.

    Logs by component

    A description of Tivoli Netcool Performance Manager logs organized by component.

    Installation log files

    A list of log files created by the Tivoli Netcool Performance Manager Wireline Components during installation. Where relevant, recommendations regarding the deletion of log files are provided.

    Database
    • /opt/Proviso/*
      (Do not delete these files, because they are used during upgrades and for maintenance. Delete these files only on uninstallation.)
    • /var/tmp/PvInstall/install.cfg
    • /var/tmp/PvInstall/install.log

    DataChannel

    The DataChannel installer logs to stdout and stderr, so there are no logs resulting from the installation process.


    Tivoli Integrated Portal

    $home represents the root location for the Tivoli Integrated Portal.

    • Install logs
      - $home/IA-TIPInstall-XX.log
      - $home/TCR13InstallTrace00.log
      - $home/TCR13InstallMessage00.log
    • Uninstall logs
      - $home//IA-TIPUninstall-XX.log

    Logs are also created for the Deployment Engine; these are not entirely Tivoli Integrated Portal or Tivoli Common Reporting related.

    DataView
    • $home/DataView_InstallLog.log

    DataLoad
    • /tmp/dlSetup_install_`date +%Y.%m.%d`

    DataMart
    • /var/tmp/PvInstall/install.log

    Installation

    The log resulting from the installation of the Topology Editor is located at:
      /Topology_Editor_InstallLog.log

    The logs resulting from any run of the Deployer are created in /tmp/ProvisoConsumer.

    The main log file is:
    • /tmp/ProvisoConsumer/log.txt

    Deployment plan logs:
    • /tmp/ProvisoConsumer/Plan/logs/[INSTALL_

    Deployer Ant logs:
    • /tmp/ProvisoConsumer/Plan/MachinePlan_/logs

    The best cleanup method for installation logs is to remove the /tmp/ProvisoConsumer directory when the system has been running long enough to validate all aspects of a working installation. A ProvisoConsumer directory is created in the /tmp directory of each server in the installation, not just for the primary host.

    Topology Editor logs are located at:
      /topologyEditor/topologyEditorTrace.log

    Note: A new log file is created each time the Topology Editor is run.


    • TNPM product identifier: "GYM"
    • Component: DC
    • Message identifier: 0412
    • Severity level character: W=warning (can also be I=information, E=error)
    • Category: DISCARDED_RECORDS
    • Message Text: Got 4 duplicate records, Discarded 0, Total of 4 for mid: 2206 rid: 200023263

    If log messages do not have a message ID, then only the severity is shown.

    Log message example:
      V1:1234 2007.01.16-06.02.03 UTC CME.1.1-25201:3456 GYMDC0412W DISCARDED_RECORD
      Got 4 duplicate records, Discarded 0, Total of 4 for mid: 2206
      rid: 200023263 4|4|2006| 200023263
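When you filter logs by component or severity, it can help to split a message ID such as GYMDC0412W into the parts listed above. The following is a minimal sketch, assuming the fixed layout that the breakdown implies: a 3-character product identifier, a 2-character component, a 4-digit message number, and one severity character (I, W, or E). The function name parse_msg_id is illustrative. IDs that do not follow this layout are rejected.

```shell
#!/bin/sh
# parse_msg_id ID: split an ID such as GYMDC0412W into product, component,
# message number, and severity. Hypothetical helper; the fixed-width layout
# is an assumption drawn from the example breakdown, not a product guarantee.
parse_msg_id() {
    id=$1
    case $id in
        [A-Z][A-Z][A-Z][A-Z][A-Z][0-9][0-9][0-9][0-9][IWE]) ;;
        *) echo "unrecognized message ID: $id" >&2; return 1 ;;
    esac
    product=$(printf '%s' "$id" | cut -c1-3)
    component=$(printf '%s' "$id" | cut -c4-5)
    number=$(printf '%s' "$id" | cut -c6-9)
    case $(printf '%s' "$id" | cut -c10) in
        I) severity=information ;;
        W) severity=warning ;;
        E) severity=error ;;
    esac
    echo "product=$product component=$component number=$number severity=$severity"
}

parse_msg_id GYMDC0412W
# prints: product=GYM component=DC number=0412 severity=warning
```

A typical use is to feed the function the message ID field of each log line and keep only the lines whose severity is warning or error.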

    Walkback logs

    Walkback logs are generated when a Tivoli Netcool Performance Manager application encounters serious problems. In most cases, walkback logs are produced just before the application shuts down because of the error.

    These logs are crucial for problem determination by IBM Technical Support. The name of the file begins with walkback- and includes the DataChannel component and timestamp (for example, walkback-UBA.1.2-18032-2007.08.21-16.51.32.log).

    Unless you have advanced knowledge of the DataChannel, only the first few lines of a walkback log are useful. The following log entry is an example:
      EXCEPTION: ORA-01034: ORACLE not available
      ORA-27101: shared memory realm does not exist
      SVR4 Error: 2: No such file or directory
      FACILITY_NAME: LDR.1-11827
      Release: 4.4.1 R2E2
      Build: Guam.156
      ORIGINATOR: an OracleThreadedConnection( hsvcctx = 245F374 )
      PARAMETER: OrderedCollection (an OracleError)
      TEXT: ORA-01034: ORACLE not available
      ORA-27101: shared memory realm does not exist
      SVR4 Error: 2: No such file or directory

    Note: Walkback files must be manually deleted. Due to the DataChannel cron settings, a process that fails is repeated every 5 minutes. Because a new walkback log is generated upon every failure, the result can be many unneeded files.
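Because walkback files accumulate until someone removes them, a periodic review of old files can be useful. The following is a minimal sketch that lists walkback logs older than a given number of days. The directory is passed in explicitly because this section does not state where walkback files are written, and the function name list_old_walkbacks is illustrative.

```shell
#!/bin/sh
# list_old_walkbacks DIR DAYS: print walkback logs under DIR that were last
# modified more than DAYS days ago. DIR is an assumption supplied by the
# caller; nothing is deleted by this function.
list_old_walkbacks() {
    find "$1" -type f -name 'walkback-*.log' -mtime +"$2" -print
}

# Example call (the path is a placeholder, not a documented location):
#   list_old_walkbacks /path/to/datachannel/log 7
```

Listing before deleting is deliberate: walkback logs are crucial for IBM Technical Support, so consider archiving any file that relates to an open problem before you remove it.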

    DataLoad logs

    DataLoad logs include the SNMP.log, pvmdmgr.log, and WatchDog.log logs. The default location of these logs is the /DLHOME/log directory, where DLHOME is the location of DataLoad on your system.

    SNMP log

    The SNMP log file contains detailed messages about all SNMP requests. All SNMP log files are created with the date in their name. The name of the local SNMP log file also includes the collector number, for example: 2010.04.27SNMP.1.1.log

    By default, SNMP logs include the following recurring events:
    • Close hour events
    • Debug level changes
    • Start and stop messages


    Note: Events reported in this log are also reported to the DataChannel log.

    Only the current debug level can be set by using the Collector Information Tool as described in the IBM Tivoli Netcool Performance Manager: DataMart Operation Guide. On restart, the collector switches back to the permanent debug level (which by default is Fatal + Warning + Info messages). The permanent debug level is set through the Topology Editor. Changing the debug level is useful when a specific network device or device group does not respond correctly to SNMP requests.

    Pvmdmgr.log

    The pvmdmgr.log file stores events about the start and stop sequence of the daemon.

    Occasionally, the "PVM Collecting daemon is running" message is displayed before the process is in complete run mode.

    WatchDog logs

    The daily DataLoad WatchDog.log file contains entries about pings sent to the collector to ensure that the daemon is still running. The name of this log begins with the date for which events are recorded, for example 2007.08.16WatchDog.log.

    DataMart logs

    DataMart does not have a central log. Instead, DataMart information is written to log files, such as logFile.PVM or TraceInventory.log, that are associated with individual components or actions, such as Inventory. The default location of these logs is the /DMHOME/log directory, where DMHOME is the location of DataMart on your system.

    Note: The inventory process does NOT automatically create a log. If inventory is run as a cron entry, you must redirect the data to a specified log file; otherwise no log data is stored. To show time stamps in an inventory log initiated by cron, you must add the following lines to the dataMart.env file:
      PVM_LOG_DATE=1
      export PVM_LOG_DATE

    TraceInventory.log

    This log is created when the SNMP Inventory GUI is used. The log file contains messages that sequentially indicate the processing status of an inventory. The following example shows a typical Discover_Analyze entry in this log.

    logFile.POLLPROFILE.{collector ID}

    This log contains messages related to bulk inventory, and one file is produced for each bulk collector profile. The file suffix follows the format .bulk_n, where n is the number of the collector.


    logFile.*

    This type of log file records minor events related to GUI or module function and is generated as required. Examples of this type of log file include logFile.POLLINVENTORY and logFile.RESMGR.

    provisoinfod*.log

    This log contains messages about internal communications and is of limited use for troubleshooting. At the end of each day, a UNIX timestamp is appended to the file name (for example, provisoinfod1187726401534.log).

    NotifyDBSpace*.log fileThe NotifyDBSpace*.log file is a daily automatic file containing messages about thestatus of the database. At the end of each day, a UNIX timestamp is appended tothe file name (for example, NotifyDBSpace1190102314443.log).
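    When sorting rotated provisoinfod or NotifyDBSpace files, it can help to turn the numeric suffix back into a readable date. The following is a minimal sketch, not part of the product. It assumes that the 13-digit suffix is a UNIX epoch value in milliseconds, which the example file names suggest but this guide does not state. The function name rotated_log_time is hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: the suffix is a UNIX epoch timestamp in milliseconds (13 digits).
_ROTATED = re.compile(r"(?P<prefix>[A-Za-z]+?)(?P<millis>\d{13})\.log$")


def rotated_log_time(file_name: str) -> datetime:
    """Return the UTC time encoded in a rotated log file name."""
    match = _ROTATED.match(file_name)
    if match is None:
        raise ValueError(f"no 13-digit timestamp suffix in {file_name!r}")
    millis = int(match.group("millis"))
    # Integer arithmetic avoids floating-point rounding of the millisecond part.
    whole_seconds = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
    return whole_seconds + timedelta(milliseconds=millis % 1000)


if __name__ == "__main__":
    for name in ("provisoinfod1187726401534.log", "NotifyDBSpace1190102314443.log"):
        print(name, "->", rotated_log_time(name).isoformat())
```

    If the suffix turns out to use a different unit on your system, only the division in rotated_log_time needs to change.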

    DataView logs
    DataView does not have a central log. Instead, DataView writes DataView traces to the Tivoli Integrated Portal.

    Note: DataView log messages and configuration options are covered in the IBM Tivoli Netcool Performance Manager: DataView User and Administrator Guide, under Configuring trace and logging in Chapter 4: Administration tasks.

    Database log
    The Tivoli Netcool Performance Manager database uses the Oracle-supplied log. The default location for this log file is /ORACLEHOME/admin/PV/bdump/alert_PV.log, where ORACLEHOME is the location of Oracle on your system.

    Logs messages format
    In general, each log message indicates the date, time, Tivoli Netcool Performance Manager component, severity code, event ID, and event description.

    The following table describes log message elements.

    Table 2. Log message elements.

    Field    Description

    Date and Time    Date and time using the following format: ..-..

    Timezone    Always UTC.

    Component    Name of the Tivoli Netcool Performance Manager component and its process ID, separated by a dash (-). Names of Tivoli Netcool Performance Manager components are defined privately for each subsystem. DataChannel uses channel-based naming conventions (for example, CME.1.1); other subsystems can develop their own conventions. Some components include both the process ID and thread ID separated by a colon (for example, CME.1.1-5638:415).

    Severity Code    Event severity code. For more information, see the description of the LOG_FILTER setting in Topology Editor log settings on page 16.

    Event ID    Event identifier. For more information, see Event IDs on page 22.

    Description    Description corresponding to the Event ID.
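    When filtering logs by process or thread, the Component field has to be split into its parts. The following is a minimal sketch based only on the convention described for this field (name, dash, process ID, optional colon and thread ID). The Component type and the parse_component function are hypothetical names used for illustration.

```python
from typing import NamedTuple, Optional


class Component(NamedTuple):
    name: str                 # for example CME.1.1
    pid: int                  # process ID after the dash
    thread_id: Optional[int]  # thread ID after the colon, if present


def parse_component(field: str) -> Component:
    """Split a Component field such as 'CME.1.1-5638:415' into its parts."""
    # Split on the last dash so that the numeric part is always on the right.
    name, _, ids = field.rpartition("-")
    if not name or not ids:
        raise ValueError(f"expected <name>-<pid>[:<thread>], got {field!r}")
    pid_text, _, thread_text = ids.partition(":")
    thread_id = int(thread_text) if thread_text else None
    return Component(name, int(pid_text), thread_id)


if __name__ == "__main__":
    print(parse_component("CME.1.1-5638:415"))
    print(parse_component("CME.1.1-5638"))
```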

    Logging configuration and information utilities
    You can configure logging behavior or use information utilities to help troubleshoot Tivoli Netcool Performance Manager components.

    DataChannel logs configuration
    Configuration options that govern DataChannel logging behavior are set by using the Topology Editor and are maintained in the database. Logging behavior can be set at three levels: DataChannel, specific DataChannel components, and all DataChannel components (global). Logging settings are controlled by the log configuration.

    Table 3. DataChannel log configuration components.

    Level    Description

    DataChannel    Specify logging behavior for DataChannel, including the Channel Manager (CMGR), Channel Name Server (CNS), and Application Manager (AMGR). These settings override any conflicting options set using GLOBAL. To specify, use the following syntax:
    LOG.<option>=<value>
    where
    <option> = log configuration option (see XREF)
    <value> = value for the configuration option
    Example: LOG.ROOT_DIRECTORY=/opt/datachannel

    DataChannel Component    Specify logging behavior for DataChannel components (UBA, FTE, CME, LDR, DLDR). These settings override any conflicting options set using GLOBAL. To specify, use the following syntax:
    <component>.<channel>.<collector>.<option>=<value>
    where
    <component> = 3- or 4-character string for the component (UBA, CME, FTE, LDR, DLDR)
    <channel> = DataChannel number.
    <collector> = Collector number.
    <option> = Configuration option (see XREF)
    <value> = Value
    Example: CME.2.500.DUAL_LOGGING=true

    Global    Specify logging behavior for all DataChannel components. To specify, use the following syntax:
    GLOBAL.<option>=<value>
    where
    <option> = Configuration option (see XREF)
    <value> = Value
    Example: GLOBAL.FC_RETENTION_HOURS=48
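    The override rule in Table 3 (a DataChannel or component setting wins over a conflicting GLOBAL setting) can be sketched as a two-step lookup. This is an illustration only: the real settings are maintained in the database through the Topology Editor, and the functions parse_settings and effective_option are hypothetical helpers that work on exported key=value lines.

```python
from typing import Dict, Iterable, Optional


def parse_settings(lines: Iterable[str]) -> Dict[str, str]:
    """Turn 'KEY=value' lines into a dictionary, skipping anything else."""
    settings: Dict[str, str] = {}
    for line in lines:
        key, sep, value = line.strip().partition("=")
        if sep and key:
            settings[key.strip()] = value.strip()
    return settings


def effective_option(settings: Dict[str, str], scope: str, option: str) -> Optional[str]:
    """Look up an option for a scope such as 'CME.2.500' or 'LOG'.

    The scoped key is checked first; GLOBAL is used only as a fallback,
    mirroring the override rule described in the table.
    """
    for key in (f"{scope}.{option}", f"GLOBAL.{option}"):
        if key in settings:
            return settings[key]
    return None


if __name__ == "__main__":
    example = parse_settings([
        "GLOBAL.DUAL_LOGGING=false",
        "CME.2.500.DUAL_LOGGING=true",
        "LOG.ROOT_DIRECTORY=/opt/datachannel",
    ])
    print(effective_option(example, "CME.2.500", "DUAL_LOGGING"))
    print(effective_option(example, "UBA.2.500", "DUAL_LOGGING"))
```

    In this sketch the CME.2.500 component uses its own DUAL_LOGGING value, while UBA.2.500 falls back to the GLOBAL value because it has no setting of its own.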

    Logging configuration changes in Tivoli Netcool Performance Manager can only be made by the Installer. All components must be restarted to apply the changes. The settings include enabling or disabling central and local logging and tracing, and changing the log levels for local and remote logging and tracing. This behavior is consistent with the previous releases.

    Topology Editor log settings
    Log options that can be specified in the Topology Editor are described in the following table.

    Table 4. Log options in Topology Editor

    Option    Levels    Description

    DUAL_LOGGING    GLOBAL and Component    Use true or false to turn dual logging on or off. When set to false at the GLOBAL level, only DataChannel logs are generated. When set to true at the GLOBAL level, individual logs for all DataChannel components are generated, in addition to the DataChannel log. Default is false.
    Example: UBA.2.500.DUAL_LOGGING=true

    LOG_PORT    GLOBAL    Port number of the log server for common log and trace files.
    Example: GLOBAL.LOG_PORT=25000

    LOG_SERVER    GLOBAL    Host name of the log server for common log and trace log files.
    Example: GLOBAL.LOG_SERVER=cme4

    MAX_LOGS    GLOBAL    Retention period for local trace files in days.
    Example: CME.1.1.MAX_LOGS=7

    LOG_PORT    GLOBAL    Port number to use for logging.
    Example: GLOBAL.LOG_PORT=25000

    LOG_SERVER    GLOBAL    Host name of the log server.
    Example: GLOBAL.LOG_SERVER=burlington.acme.com

    RENDER_MESSAGE_ARGUMENTS    GLOBAL
    Example: CME.1.1.RENDER_MESSAGE_ARGUMENTS=false

    SYSLOG_FACILITY    GLOBAL    Enter 128 to set the syslog facility to localhost.
    Example: GLOBAL.SYSLOG_FACILITY=128
    Note: The syslog daemon must be running locally to have access to the log host on the network.


    FILTER or LOG_FILTER    ALL    The event types to log:
    - F = Failure (a hard process error)
    - E = Error (process termination or disk space problem)
    - W = Warning (frequent messages that require no specific action)
    - I = Information only messages
    - 1 = Level 1 debugging information
    - 2 = Level 2 debugging information
    - 3 = Level 3 debugging information
    Example: UBA.2.500.LOG_FILTER=FE

    LOG_FILE    LOG    Name of the common log file.
    Example: LOG.LOG_FILE=/opt/datachannel/log/tnpmlog.log

    LOG_MAX_LOGS    LOG    Retention period for common log and trace files in days.
    Example: LOG.LOG_MAX_LOGS=7

    LOG_RENDER_MESSAGE_ARGUMENTS    LOG    Write message parameters to log file.
    Example: LOG.LOG_RENDER_MESSAGE_ARGUMENTS=FALSE

    FILE    LOG    Name and location of the DataChannel log file.
    Example: LOG.FILE=/opt/datachannel/log/proviso.log

    TRAP_HOST    LOG    Host to send traps generated from log rules.
    Example: LOG.TRAP_HOST=127.0.0.1

    TRAP_PORT    LOG    Port to send traps generated from log rules.
    Example: LOG.TRAP_PORT=162

    SMTP_HOST    LOG    Host to send emails generated from log rules.
    Example: LOG.SMTP_HOST=127.0.0.1

    SMTP_PORT    LOG    Port to send emails generated from log rules.
    Example: LOG.SMTP_PORT=162

    SMTP_TO    LOG    To address for emails generated from log rules.

    LOG_FORWARD    LOG    Use true or false to enable or disable syslog forwarding.
    Example: LOG.LOG_FORWARD=false

    LOG_FORWARD_FILTER    LOG    Filter to use for forwarded log messages.
    Example: LOG.LOG_FORWARD_FILTER=FEWI 123

    LOG_FORWARD_PORT    LOG    UDP port used by the host defined in LOG_FORWARD_SERVER.
    Example: LOG.LOG_FORWARD_PORT=514

    LOG_FORWARD_SERVER    LOG    Host name of the syslog server to forward log messages.
    Example: LOG.LOG_FORWARD_SERVER=localhost

    LOG_TRAP_HOST    LOG    Host where SNMP traps for specific types of log messages are sent.
    Example: LOG.LOG_TRAP_HOST=localhost
    Note: The rules file defining the message types must be installed and loaded.

    LOG_TRAP_PORT    LOG    Port used by the host defined in LOG_TRAP_HOST.
    Example: LOG.LOG_TRAP_PORT=162
    Note: The rules file defining the message types must be installed and loaded.

    LOG_TRAPS    If set to TRUE, traps sent by the CME as the result of threshold violations are added to the CME log file.
    Example: CME.1.1.LOG_TRAPS=TRUE


    MAX_LOGS    ALL    Maximum number of days to retain log files.
    Example: UBA.2.500.MAX_LOGS=3

    ROOT_DIRECTORY    LOG    Root directory where DataChannel logs are generated. The logs are located in the log directory directly under this root.
    Example: LOG.ROOT_DIRECTORY=/opt/datachannel

    SUPPRESS_TIMESTAMP_ON_FORWARD    LOG    Suppresses the timestamp.
    Example: LOG.SUPPRESS_TIMESTAMP_ON_FORWARD=true

    Note: GLOBAL settings are used by all applications. LOG settings are application-specific settings that affect only that application. ALL settings can be used as both global and application-specific settings.
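    A FILTER or LOG_FILTER value is a string of one-character codes, so a value such as FE or FEWI 123 can be expanded into the event types it enables. The following is a minimal sketch using only the codes listed for the FILTER option. The describe_filter function is a hypothetical helper, and ignoring whitespace inside the value is an assumption based on the FEWI 123 example.

```python
from typing import List

# Codes as listed for the FILTER / LOG_FILTER option.
FILTER_CODES = {
    "F": "Failure",
    "E": "Error",
    "W": "Warning",
    "I": "Information",
    "1": "Level 1 debugging",
    "2": "Level 2 debugging",
    "3": "Level 3 debugging",
}


def describe_filter(value: str) -> List[str]:
    """Expand a filter string such as 'FE' into the event types it selects."""
    selected: List[str] = []
    for code in value:
        if code.isspace():
            continue  # assumption: spaces are separators, as in 'FEWI 123'
        if code not in FILTER_CODES:
            raise ValueError(f"unknown filter code {code!r} in {value!r}")
        selected.append(FILTER_CODES[code])
    return selected


if __name__ == "__main__":
    print(describe_filter("FE"))
    print(describe_filter("FEWI 123"))
```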

    DataView logs configuration

    Note: DataView log messages and configuration options are covered in the IBM Tivoli Netcool Performance Manager: DataView User and Administrator Guide, under Configuring trace and logging in Chapter 4: Administration tasks.

    statGet utility
    statGet is a utility that is located at each collector and provides DataLoad statistics like the statistics that are accessible from the Collector Information Tool GUI. statGet can be run on any local server, and is located in the following default locations:
    - ~/opt/dataload/bin/
    - ~/opt/datamart/bin/

    Note: For remote systems, you must use the Collector Information Tool. For more information, see the IBM Tivoli Netcool Performance Manager: DataMart Operation Guide.

    Syntax
    statGet [-l {objects|instances|counters|stats|requests}] [-o ] [-i ] [-c ] [-D ] [-S ] [-P ] [-T ] [-T2 ] [-?] [-v]

    Options
    Table 5. Options for statGet utility

    Option    Description

    [-l ]    <objectType> is one of the following items:
    - objects - Lists all main classes of statistics counters.
    - instances - Lists all instances of a specific statistic class.
    - counters - Lists all counter names of a specific statistic class.
    - stats - Lists all counter values for a specific class, instance, and so on.
    - requests - Lists all requests configured inside the scheduler.
    The requests objectType is the default objectType if no -l option is specified.

    [-o ]    <object> is a flag that filters the result set for -l stats and -l instances.

    -i    <instance> is a flag that filters the result set for -l stats and -l counters.

    -c    is a flag that filters the result set for -l stats.

    [-D ]    is a number from 0 - 6, where 0 specifies no debugging, and 6 specifies verbose debugging.

    [-S ]    is the name of the server that hosts a specific SNMP collector. If the flag is undefined, the PVM_SSDADDRESS environment variable is used. If the environment variable is also undefined, the value 'localhost' is used.

    [-P ]    <portNumber> is the listening port number for the collector. If the flag is undefined, the PVM_SSDPORT environment variable is used. If the environment variable is also undefined, the value 3002 is used.

    [-T ]    is the amount of time permitted to establish a connection before a timeout. If the flag is undefined, a default value of 20 seconds is used.

    [-T2 ]    is the amount of time permitted for a connection response. If the flag is undefined, a default value of 7200 seconds (2 hours) is used.

    -?    Displays the statGet command reference page.

    -v    Displays the build version string.

    Examples
    Table 6. Examples of statGet usage

    Use    Syntax

    Dump all pending and current SNMP requests.    statGet

    Get all classes of statistics counters.    statGet -l objects

    Get all possible instances of the class of counters 'Targets'.    statGet -l instances -o Targets

    Get values of all statistics counters of class 'Targets', for instance '_Total'.    statGet -l stats -o Targets -i _Total

    Get all currently configured requests in DataLoad.    statGet -l requests
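    If statGet is called from a monitoring script, the command line can be assembled from the options shown in Table 5 and Table 6. The following is a minimal sketch that only builds the argument list and does not run the utility. The statget_args function and its parameter names are hypothetical; only the flags -l, -o, -i, -S, and -P come from this guide.

```python
from typing import List, Optional

# Values accepted by -l according to the statGet syntax.
LIST_TYPES = ("objects", "instances", "counters", "stats", "requests")


def statget_args(
    list_type: Optional[str] = None,
    obj: Optional[str] = None,
    instance: Optional[str] = None,
    server: Optional[str] = None,
    port: Optional[int] = None,
) -> List[str]:
    """Build an argument list for statGet from the documented flags."""
    args = ["statGet"]
    if list_type is not None:
        if list_type not in LIST_TYPES:
            raise ValueError(f"-l must be one of {LIST_TYPES}, got {list_type!r}")
        args += ["-l", list_type]
    if obj is not None:
        args += ["-o", obj]
    if instance is not None:
        args += ["-i", instance]
    if server is not None:
        args += ["-S", server]
    if port is not None:
        args += ["-P", str(port)]
    return args


if __name__ == "__main__":
    # Same request as the 'Targets' / '_Total' example in Table 6.
    print(" ".join(statget_args("stats", obj="Targets", instance="_Total")))
```

    The resulting list can be passed to subprocess.run on a host where statGet is on the PATH. With no arguments the list is just statGet, which matches the default requests behavior described in Table 5.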

    Configuring trace and logging
    The default logging level can be set by using the configure command. You can manage logs and trace in the Tivoli Integrated Portal from the WebSphere Administrative Console, which you open from the Settings > WebSphere Administrative Console option.


    Default logging level
    You use the configure command to configure the Tivoli Integrated Portal logging level for Tivoli Netcool Performance Manager packages or components.

    You must restart the application server for your changes to take effect. When you have restarted the server, the logging level you have selected becomes the default logging level.

    Trace logging for DataView

    Note: DataView log messages and configuration options are covered in the IBM Tivoli Netcool Performance Manager: DataView User and Administrator Guide, under Configuring trace and logging in Chapter 4: Administration tasks.

    The configure command
    Configures the database connection information and the Tivoli Integrated Portal logging level for the Tivoli Netcool Performance Manager installation.

    You must restart the application server for your changes to take effect. When you have restarted the server, the logging level you have selected becomes the default logging level.

    Location

    /products/tnpm/dataview/bin

    Syntax

    configure.sh -tipuser -tippassword -type jdbc [-driverhome ] [-jdbcurl ] [-jdbcuser ] [-jdbcpassword ]

    configure.sh -tipuser -tippassword -type logging [-level ] [-package ] [-module ]

    configure.sh -tipuser -tippassword -type debug [-state ]

    Parameters

    The Tivoli Integrated Portal installation directory, by default /opt/IBM/tivoli/tipv2.

    -tipuser
    A Tivoli Integrated Portal user name for the local Tivoli Integrated Portal.

    -tippassword
    The Tivoli Integrated Portal user password for the local Tivoli Integrated Portal.

    -type
    The three types of configuration options.

    jdbc    Configures the JDBC database connection information.

    logging    Configures the Tivoli Integrated Portal logging level for the Tivoli Netcool Performance Manager installation.

    debug    Configures the remote debugging.

    Optional parameters

    -driverhome
    The JDBC driver location.

    -jdbcurl
    The database URL.

    -jdbcuser
    The database user name.

    -jdbcpassword
    The database password.

    -level
    Set the level of logging detail: fatal, severe, warning, audit, info, config, detail, fine, finer, finest, or all.

    -package
    Set logging for this software package. Wildcards * and ? are supported.

    -module
    Set logging for this software component. Wildcards * and ? are supported.

    -state
    This is the remote debugging state.

    on    The remote debugging state is on.

    off    The remote debugging state is off.

    Examples

    The following command sets a Tivoli Integrated Portal logging level of detail. A wildcard selects all of the com.ibm.tivoli.tnpm.dal packages.

    configure.sh -tipuser -tippassword logging -level detail -package com.ibm.tivoli.tnpm.dal.*

    The following command sets the JDBC URL to jdbc:oracle:thin:@host1.company.com:1521:PV:

    configure.sh -tipuser -tippassword jdbc -driverhome "/root/directory/tnpm.dataview" -jdbcurl jdbc:oracle:thin:@host1.company.com:1521:PV -jdbcuser -jdbcpassword


    Troubleshooting
    You can use logs for a number of troubleshooting tasks.
    - Use DUAL_LOGGING to write component-specific log files in addition to proviso.log.
    - CME BUILD_TREE messages indicate whether Formula Requests are deployed and for how many subelements (Debug Level 2).
    - UBA SCANNED_INPUT, PERF_ACQUIRE_ALL, START_INPUT, and METRIC_STREAM_INFO messages indicate if metric input is being retrieved and processed.
    - UBA PERF_INVFLUSH messages indicate when discovered elements and subelements are written to the database.
    - Use statGet to obtain SNMP DataLoad information.

    Event IDs
    For information about Event IDs that are used in Tivoli Netcool Performance Manager logs, see the Error Messages section in this guide.


    Chapter 3. Contacting IBM support

    IBM Support provides assistance with product defects, answering FAQs, and performing rediscovery.

    Before you begin

    After trying to find your answer or solution by using other self-help options such as technical notes, you can contact IBM Support. Before contacting IBM Support, your company must have an active IBM maintenance contract, and you must be authorized to submit problems to IBM. For information about the types of available support, see the Support portfolio topic in the Software Support Handbook (http://www14.software.ibm.com/webapp/set2/sas/f/handbook/offerings.html).

    Procedure

    Complete the following steps to contact IBM Support with a problem:

    1. Define the problem, gather background information, and determine the severity of the problem. For more information, see the Getting IBM support topic in the Software Support Handbook (http://www14.software.ibm.com/webapp/set2/sas/f/handbook/getsupport.html).
    2. Gather diagnostic information.
    3. Submit the problem to IBM Support in one of the following ways:
    - Using IBM Support Assistant (ISA)
    - Online through the IBM Support Portal (http://www.ibm.com/software/support/): You can open, update, and view all your Service Requests from the Service Request portlet on the Service Request page.
    - By phone: For the phone number to call in your country, see the Directory of worldwide contacts web page (http://www.ibm.com/planetwide/).

    Results

    If the problem that you submit is for a software defect or for missing or inaccurate documentation, IBM Support creates an Authorized Program Analysis Report (APAR). The APAR describes the problem in detail. Whenever possible, IBM Support provides a workaround that you can implement until the APAR is resolved and a fix is delivered. IBM publishes resolved APARs on the IBM Support website daily, so that other users who experience the same problem can benefit from the same resolution.

    Exchanging information with IBM

    To diagnose or identify a problem, you might need to provide IBM Support with data and information from your system. In other cases, IBM Support might provide you with tools or utilities to use for problem determination.


    Chapter 4. Introduction SNMP Inventory

    This chapter provides an introduction to the Tivoli Netcool Performance Manager SNMP Inventory.

    Overview
    Tivoli Netcool Performance Manager allows the operator to decide how much the Tivoli Netcool Performance Manager DataMart will rely upon the OSS Inventory system.

    The Inventory system can be virtually anything from a full-featured commercial Inventory package, to an EMS or Node Manager like HP OpenView, to a flat file like /etc/hosts. The minimum required is a list of the IP addresses of resources to monitor.

    Tivoli Netcool Performance Manager can discover both elements (resources that have an IP address, such as a router or a switch), and the sub-elements associated with or contained in them, such as an interface or a port.

    Tivoli Netcool Performance Manager supports the following three modes of element and sub-element discovery:

    Mode    Inventory Contains    Tivoli Netcool Performance Manager Discovers

    1    Nothing    Elements, sub-elements

    2    Elements    Sub-elements

    3    Elements, sub-elements    Nothing

    Most Tivoli Netcool Performance Manager deployments are in mode two. In this mode, Tivoli Netcool Performance Manager imports a list of elements and then walks through the MIB to discover the sub-elements. In the first mode, Tivoli Netcool Performance Manager sweeps the network to discover the elements and their associated sub-elements.

    Discovery
    Tivoli Netcool Performance Manager's Discovery capabilities include some powerful and flexible tools that allow you to determine exactly what Tivoli Netcool Performance Manager will monitor, and how the sub-elements will be labeled and grouped.

    These capabilities make it possible to automatically initiate data collection, threshold monitoring, and reporting on discovered elements.

    Using a formula language, Tivoli Netcool Performance Manager can be configured to walk through an element's MIBs to discover particular MIBs representing users, tunnels, protocols, service classes or other sub-elements. Particular OIDs can be used to automatically create a label for the sub-element.


    For example, the sub-element label could be a combination of the element name, the interface, the port and the customer name, all taken from the MIB.

    Metrics and Properties
    In addition to the identifier of the sub-element and the metrics collected for it, Tivoli Netcool Performance Manager allows the operator to create any number of user-defined properties.

    There are two main differences between metrics and properties. Metrics come from a monitored resource and are used to calculate statistics that are the basis of performance reports and alarm thresholds. Metrics are generally numeric values that change frequently, like the number of packets transmitted or a resource's availability.

    Properties, by contrast, are values that change less frequently, such as the CIR (committed information rate) or the location of the element. Properties consist of metadata-like identifiers or labels for such things as the customer and/or the services using a particular sub-element.

    The values for properties can be discovered automatically from the monitored resource, or they can be imported from Inventory, provisioning or from another OSS component.

    Inventory Synchronization and Change Management
    Sub-element properties such as the CIR or customer name can change. Tivoli Netcool Performance Manager tracks the change and the time of the change, so that reports are displayed correctly.

    For example, utilization may be calculated against CIR. After the CIR is updated, reports must reflect the new value for utilization calculations. But reports that show dates prior to the CIR update must use the old CIR value. Tivoli Netcool Performance Manager manages this without error.
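    The CIR example amounts to looking up the property value that was in effect on the date being reported. The following is a minimal sketch of that idea and not a description of how Tivoli Netcool Performance Manager stores property history. The PropertyHistory class, its methods, and the sample CIR values are all hypothetical.

```python
import bisect
from datetime import date
from typing import List, Optional


class PropertyHistory:
    """Keeps the values of one property together with their effective dates."""

    def __init__(self) -> None:
        self._dates: List[date] = []
        self._values: List[float] = []

    def set(self, effective: date, value: float) -> None:
        """Record that the property has this value from the given date on."""
        index = bisect.bisect_right(self._dates, effective)
        self._dates.insert(index, effective)
        self._values.insert(index, value)

    def value_on(self, day: date) -> Optional[float]:
        """Return the value in effect on a day, or None if none was set yet."""
        index = bisect.bisect_right(self._dates, day)
        return self._values[index - 1] if index else None


if __name__ == "__main__":
    cir = PropertyHistory()
    cir.set(date(2013, 1, 1), 512.0)   # hypothetical CIR in kbit/s
    cir.set(date(2013, 6, 1), 1024.0)  # CIR upgraded on 1 June

    traffic = 256.0  # hypothetical measured rate in kbit/s
    for day in (date(2013, 3, 15), date(2013, 7, 1)):
        print(day, "utilization:", traffic / cir.value_on(day))
```

    A report for March uses the old CIR and a report for July uses the new one, even though both are generated after the change.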

    If a sub-element is assigned to a new customer, the customer property will change. If the sub-element is in a particular customer's group, this can cause the sub-element to move to a new group. This can change the collection, alarm thresholds and reporting for that sub-element, automatically.

    Change Management for Elements
    The Inventory must track changes so that continuity of meta-data associated with the elements can be maintained.

    Unfortunately, Inventory is not as simple as sweeping a range of IP addresses to identify the network elements. That is just the beginning of the process. The Inventory must track changes so that continuity of meta-data associated with the elements (such as associations to customers, VPNs and services) can be maintained. At least one additional challenge remains to keep the element Inventory accurate, as shown with these two problem statements:
    - IP Address changes

    Problem: If you are tracking a router by its IP address, and you discover a router at a new IP address, how do you know if it is a new router, or an existing router with a changed IP address?


    Tivoli Netcool Performance Manager solves this problem by associating additional properties with each element which provide additional continuity and traceability in the face of IP address changes. These additional properties can be discovered from the device itself, like SNMP sysName, or gathered externally, like the name resolved from the IP address of the element's management interface.

    - Name changes

    Problem: If you are tracking a router by its name, and you discover that the name has changed, how do you know if it is a new router with that IP address, or an existing router with a changed name?
    Tivoli Netcool Performance Manager does not track elements by their name or any other single property. Instead, by tracking a combination of properties, Tivoli Netcool Performance Manager is able to provide continuity to inventory even when any of these properties change.
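    The idea of tracking a combination of properties rather than a single one can be illustrated with a simple scoring rule. This guide does not describe the actual matching algorithm, so the following is only a sketch under stated assumptions: the property names (ip_address, sys_name, dns_name), the threshold of two matching properties, and the match_element function are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence

Element = Dict[str, Optional[str]]


def match_element(
    discovered: Element,
    known: List[Element],
    keys: Sequence[str] = ("ip_address", "sys_name", "dns_name"),
    min_matches: int = 2,
) -> Optional[Element]:
    """Return the known element that shares the most properties, if enough agree.

    An element whose IP address or name changed still matches on the
    remaining properties, so it is not treated as a new element.
    """
    best: Optional[Element] = None
    best_score = 0
    for element in known:
        score = sum(
            1
            for key in keys
            if discovered.get(key) is not None and discovered.get(key) == element.get(key)
        )
        if score > best_score:
            best, best_score = element, score
    return best if best_score >= min_matches else None


if __name__ == "__main__":
    inventory = [
        {"id": "1", "ip_address": "10.0.0.1", "sys_name": "rtr-a", "dns_name": "rtr-a.example.com"},
    ]
    readdressed = {"ip_address": "10.0.9.9", "sys_name": "rtr-a", "dns_name": "rtr-a.example.com"}
    print(match_element(readdressed, inventory))
```

    In this sketch the router found at the new address is recognized as the existing element because two of its three properties still agree.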

    By automatically tracking changes to an element, rather than discovering it as a new element or forcing the operator to manually update the database, Tivoli Netcool Performance Manager helps reduce operating costs as follows:
    - Performance and trend reports for the element show the entire history of the element, without interruption.
    - Changes to the element are shown in historical reports so they can be correlated to problems or changes in performance.
    - Meta-data, such as location, community string, or other properties remains associated with the element, saving the operator from having to re-enter this.
    - Inventory accuracy is improved because the update operation is automatic, not manual, eliminating errors.
    - Inventory accuracy is improved because synchronization is automated, eliminating manual delays.

    Change Management for Sub-Elements
    In addition to the challenge of detecting and correctly managing changes on sub-elements, it is important to display this information correctly on reports.

    From an external (customer) point of view, sub-element changes should be invisible. From an internal (network operations) perspective, the change must be visible. Tivoli Netcool Performance Manager manages all of this automatically.

    There are many reasons why the identifier (in SNMP, the Object Identifier, or OID) might change for a particular sub-element. Assuming that the sub-element is a port or virtual circuit residing on an interface, some of the changes will be due to failure and recovery scenarios, or network reconfigurations due to growth:
    - Adding or removing an interface card can cause the SNMP indexes to shift for other sub-elements.
    - The interface the sub-element resides upon might fail, forcing the service associated with the sub-element to be moved to another interface.
    - The service may be moved to a currently unused sub-element.
    - The service may be moved to a sub-element in use, and the service currently on the sub-element is moved to another sub-element.

    Most network changes should be invisible to customers. Their reports should reflect the quality of their service, and moves and changes to the network to preserve their service should be invisible to them. This is particularly important for SLA reporting. You certainly want to avoid forcing the customer to view two reports, one for the original NIC and a second report for the replacement NIC.

    Throughout the network changes, network operations and engineering staff must have an accurate view of the actual sub-elements. For troubleshooting and capacity planning purposes, they should have a historical view of performance and traffic on a particular port, with information on changes that have occurred.

    Grouping Sub-Elements
    Properties can be used to automatically group sub-elements.

    For example, sub-elements can be grouped according to technology, customer, service or site. Groups can be hierarchical, so it is possible to create structures like the following:
    - Site/Technology, to see all ATM SVCs in the New York POP.
    - Customer/Service, to show all of the services a particular customer has subscribed to.
    - Technology/Site, to see which sites are generating the most Frame Relay activity.

    Sub-elements can exist in multiple groups simultaneously. For example, a sub-element might be part of a network operations group and a particular customer's group.
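    Hierarchical grouping by properties can be pictured as building a group path from an ordered list of property names. The following is a minimal sketch of that idea only; it does not reflect the product's grouping rules. The group_by function, the property names, and the sample sub-elements are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def group_by(sub_elements: List[Dict[str, str]], *properties: str) -> Dict[str, List[str]]:
    """Group sub-element names under a path such as 'New York/ATM'."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for sub_element in sub_elements:
        path = "/".join(sub_element[prop] for prop in properties)
        groups[path].append(sub_element["name"])
    return dict(groups)


if __name__ == "__main__":
    sub_elements = [
        {"name": "svc-1", "site": "New York", "technology": "ATM", "customer": "Acme"},
        {"name": "svc-2", "site": "New York", "technology": "Frame Relay", "customer": "Acme"},
        {"name": "svc-3", "site": "Boston", "technology": "ATM", "customer": "Globex"},
    ]
    # The same sub-element appears in both hierarchies at once.
    print(group_by(sub_elements, "site", "technology"))
    print(group_by(sub_elements, "customer", "technology"))
```

    Changing the order of the property names produces the Technology/Site view from the same data, which is why one sub-element can sit in several groups simultaneously.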

    Where to Go From Here
    Relevant information.

    For information on troubleshooting tasks to perform after a new SNMP Inventory has been run, see SNMP Inventory Troubleshooting.

    For information on periodic administrative tasks to perform, see SNMP Inventory Management.


    Chapter 5. SNMP inventory troubleshooting

    This section discusses SNMP Inventory troubleshooting for Tivoli Netcool Performance Manager.

    Overview
    The major phases of SNMP inventory.

    The Tivoli Netcool Performance Manager SNMP Inventory consists of the following three major phases, which usually happen sequentially:
    - SNMP Discovery
      Detects all resources on a target network and creates a virtual image of the network.
    - Synchronization
      Compares the virtual network image generated by the Discovery with the records in the Tivoli Netcool Performance Manager database that were created by the previous Inventory run. Any modifications (new, missing, or renamed resources, for example) are then synchronized through the application of various algorithms, and the new network image is written to the database.
    - Grouping
      Updates the grouping structure in the database, which determines the kind of information that is to be collected on each resource, element, sub-element, and so forth.

    In almost all cases, Tivoli Netcool Performance Manager's SNMP Inventory requires virtually no operator intervention.

    However, under certain circumstances, problems arise that you need to address. The following sections discuss the more common problems you are likely to encounter and, where possible, provide suggestions for remedial actions.

    It is strongly advised that you monitor the logs for potential error messages by doing one of the following:
    - Running a Discovery from the command line.
      If you run a Discovery from the command line, redirect STDERR to a log file, as follows:
      inventory -noX -action discovery -name lowell > output 2> error_log
      For a complete list of error messages written to the Tivoli Netcool Performance Manager log file, see the Messages section of this guide. For more information on using the Tivoli Netcool Performance Manager log file, see Monitoring the Log File.
    - Running a Discovery from the DataMart GUI
      If you use the DataMart GUI to initiate a Discovery, error messages will appear on the DataMart GUI > Resource tab > Inventory Tool icon > Live Information tab.


    The Inventory Tool prints out messages like the following every five seconds:

    2005/12/09 13:46:52 [PL2DBS1, 238 sec, IP done.1/ SNMP done.1/ Elmt 0.1.0/ SubElmt 0.0.0]

    These messages explain the progress of the discovery as follows:

    IP done.1
      Indicates that the IP phase of the discovery process has completed.
    SNMP done.1
      Indicates that the SNMP phase of the discovery process has completed.
    Elmt 0.1.0
      Indicates the progress of discovered elements, using the following syntax:
      numberOfObjectsInInputQueue.numberOfThreadsRunning.numberOfElementsDiscovered
    SubElmt 0.0.0
      Indicates the progress of discovered sub-elements, using the following syntax:
      numberOfObjectsInInputQueue.numberOfThreadsRunning.numberOfSubElementsDiscovered
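The three-part counters described above can be pulled out of a saved progress line with a short script. This is a minimal Python sketch that assumes the line layout shown in the sample message; the names Counter and parse_counter are hypothetical and are not part of the product.

```python
import re
from typing import NamedTuple, Optional


class Counter(NamedTuple):
    in_input_queue: int
    threads_running: int
    discovered: int


def parse_counter(line: str, label: str) -> Optional[Counter]:
    """Extract the three dot-separated numbers that follow `label`.

    `label` is "Elmt" or "SubElmt". The word boundary keeps "Elmt" from
    matching inside "SubElmt"; the space after the label is optional.
    """
    pattern = r"\b" + re.escape(label) + r"\s*(\d+)\.(\d+)\.(\d+)"
    match = re.search(pattern, line)
    if match is None:
        return None
    return Counter(*(int(value) for value in match.groups()))


sample = ("2005/12/09 13:46:52 [PL2DBS1, 238 sec, IP done.1/ SNMP done.1/ "
          "Elmt 0.1.0/ SubElmt 0.0.0]")
elements = parse_counter(sample, "Elmt")
sub_elements = parse_counter(sample, "SubElmt")
```

For the sample line, elements.threads_running is 1 and every other counter is 0, which matches the reading given above: one thread is still working on elements and nothing is waiting in either queue.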

    If after two minutes there is no change in these messages, the Inventory Tool displays a more detailed message like the following:

    2005/12/09 13:46:57 Current activity @ 2005.12.09-18.46.54
    2005/12/09 13:46:57 Stage: IP done.1
    2005/12/09 13:46:57 Stage: SNMP done.1
    2005/12/09 13:46:57 Stage: Elmt 0.1.0
    2005/12/09 13:46:57 W: R00004/192.168.80.2
    2005/12/09 13:46:57 Stage: SubElmt 0.0.0

    The line that includes the run number and IP address (2005/12/09 13:46:57 W: R00004/192.168.80.2, for example) can be used to troubleshoot possible problems, as explained in "Discovery Seems to Hang or Never Finishes."
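When the detailed message is captured to a file, the run number and IP address can be extracted from the W: lines for use in later searches. The following Python sketch assumes the W: RUN/ADDRESS layout shown in the sample; the function name current_targets is hypothetical.

```python
import re

# "W:" followed by a run number (R plus digits), a slash, and an IPv4 address.
_WORK_LINE = re.compile(r"W:\s*(R\d+)/(\d{1,3}(?:\.\d{1,3}){3})")


def current_targets(detail_lines):
    """Return (run_number, ip_address) pairs found on 'W:' lines."""
    targets = []
    for line in detail_lines:
        match = _WORK_LINE.search(line)
        if match:
            targets.append((match.group(1), match.group(2)))
    return targets


detail = [
    "2005/12/09 13:46:57 Current activity @ 2005.12.09-18.46.54",
    "2005/12/09 13:46:57 Stage: Elmt 0.1.0",
    "2005/12/09 13:46:57 W: R00004/192.168.80.2",
    "2005/12/09 13:46:57 Stage: SubElmt 0.0.0",
]
targets = current_targets(detail)
```

The resulting pair (run R00004, address 192.168.80.2) gives the two values to search for in the log files.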

    Figure 1. Errors Displayed in the DataMart GUI


    Discovery Troubleshooting

    The following sections address the more common problems that arise during Discovery.

    Discovery Does Not Start

    The following sections offer the most common solutions to problems with Discovery not starting.

    Discovery Fails Because Discovery Server Does Not Run

    What to do if the Discovery fails because the Discovery server is not running.

    About this task

    If the Discovery server fails to start, an error message like the following is returned:

    IIOP: couldnt connect to 192.168.68.251:34024: couldnt open socket: connection refused
    Error: StartInventory Failed for Discovery Server : IDL:omg.org/CORBA/INTF_REPOS:1.0 {minor 0 completion_status COMPLETED_NO}

    To troubleshoot this problem, do the following:

    Procedure
    1. Log in as pvuser on the system where the channel manager and log server are installed.
    2. Change your working directory to the $DC_HOME/bin directory by entering the following command. Note that $DC_HOME is defined as /opt/datachannel by default.
       cd $DC_HOME/bin
    3. Check whether the Discovery server is running by entering the following command:
       $ dccmd -action status -pattern DISC.*.*
       If the Discovery server is not running, the dccmd command returns output like the following:
       NUMBER FACILITY HOST STATUS ES DURATION EXTENDED STATUS
       1 DISC unresponsive

    What to do next

    ACTION: If the Discovery server is not running, do the following:

    1. Restart the Discovery server by entering a command like the following, specifying the Discovery server for your deployment (in this example we use DISC.DEV19.1):
       dccmd -action bounce -pattern DISC.DEV19.1
    2. Verify that the Discovery server is running by entering the following command:
       dccmd -action status -pattern DISC.*.*
       If the Discovery server is running, the dccmd command returns output like the following:


       NUMBER FACILITY HOST STATUS ES DURATION EXTENDED STATUS
       1 DISC DEV19.QUALLA running 1 running

    For more information on using the dccmd command, see the Netcool/Proviso Command Line Interface Guide.
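If the status check is run regularly, the captured dccmd output can be classified by a script. The Python sketch below recognizes only the two states shown in this section (running and unresponsive); any other state is reported as unknown, and the function name discovery_status is hypothetical.

```python
def discovery_status(dccmd_output: str) -> dict:
    """Map each DISC row number to 'running', 'unresponsive', or 'unknown'.

    Only rows that start with a number followed by a DISC facility are
    considered, so the header line is skipped.
    """
    statuses = {}
    for line in dccmd_output.splitlines():
        fields = line.split()
        is_disc_row = (
            len(fields) >= 2
            and fields[0].isdigit()
            and fields[1].startswith("DISC")
        )
        if not is_disc_row:
            continue
        if "unresponsive" in fields:
            state = "unresponsive"
        elif "running" in fields:
            state = "running"
        else:
            state = "unknown"
        statuses[int(fields[0])] = state
    return statuses


HEADER = "NUMBER FACILITY HOST STATUS ES DURATION EXTENDED STATUS\n"
stopped = discovery_status(HEADER + "1 DISC unresponsive")
started = discovery_status(HEADER + "1 DISC DEV19.QUALLA running 1 running")
```

A row reported as unresponsive corresponds to the case in which the Discovery server must be restarted with the bounce action.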

    Discovery Fails Because Collector Stops During Discovery

    About this task

    If the collector stops during a Discovery, several different error messages are logged. The most common error messages are the following:
    v Error: Aborted at March 14, 2005 10:21:58 pm
    v Error: Connection refused
    v Error: Discovery Server : Status of lowell : invalid CLIENTERR [DC1] R00015 Connection refused (-I 682 -D 0 -profil lowell -collector dev19.quallaby.com:3002 -nbGetIfAddress 100 -invFileTxt /opt/datamart/conf/inventory_subelements.txt -vname {} -intcollector 1)

    To troubleshoot this problem, do the following:

    Procedure
    1. Log in as pvuser (or the user name that you specified during installation) on the system where DataMart is installed.
    2. (Optional) Ensure that the Oracle database and Listener are running. For more information, see the Tivoli Netcool Performance Manager Installation Guide.
    3. Enter the following command, replacing DATAMART_ROOT with the root DataMart directory (/opt/datamart by default):
       DATAMART_ROOT/bin/pvm
       The DataMart GUI appears.

    Figure 2. DataMart GUI


    What to do next

    ACTION: Restart the collector by entering the following command on the Collector server:

    /opt/dataload/bin/pvmdmgr start

    Note: The collector cannot be restarted from the GUI.

    Discovery does not start because inventory is locked

    How to troubleshoot Tivoli Netcool Performance Manager if discovery does not start because inventory is locked.

    About this task

    To troubleshoot this problem, do the following:

    Procedure
    1. Log in as pvuser (or the user name that you specified during installation) on the system where DataMart is installed.
    2. (Optional) Ensure that the Oracle database and Listener are running. For more information, see the Tivoli Netcool Performance Manager Installation Guide.
    3. Enter the following command, replacing DATAMART_ROOT with the root DataMart directory (/opt/datamart by default):
       DATAMART_ROOT/bin/pvm
       The DataMart GUI appears, as shown in Figure 2, "DataMart GUI."
    4. Click on the Resource tab > Inventory Tool icon. The Inventory Tool appears.

    Figure 4. Stopped Collector


    b. Highlight a profile and then select Edit > Profile or click on the edit icon. The Inventory Tool Wizard appears, as shown in Figure 8, "Inventory Tool Wizard."

    Figure 7. Inventory Tool Configuration Tab


    c. Click the Next button twice to navigate to the Discovery Tool Wizard, as shown in Figure 9, "Discovery Tool Wizard."

    5. As shown in Figure 9, "Discovery Tool Wizard," check to see if the rejected IP addresses, either individually or within a specified range, have been intentionally excluded. If they have not, contact Micromuse support.

    Figure 8. Inventory Tool Wizard

    Figure 9. Discovery Tool Wizard


    Duplicate Elements Are Found

    If the same element, a router for example, has more than one IP address associated with it, Tivoli Netcool Performance Manager will "discover" the element multiple times, reject the duplicate discoveries, and write warning messages to the log.

    About this task

    Figure 10, "Rejected Duplicate Elements," for example, shows Tivoli Netcool Performance Manager rejecting three duplicate elements.

    This is expected behavior.

    If these warning messages appear often, Discovery performance may be degraded, since Tivoli Netcool Performance Manager must spend a lot of time calculating and eliminating duplicate elements.

    You may therefore want to exclude the duplicate addresses from the Discovery by doing the following:

    Procedure
    1. Invoke the Discovery Tool Wizard (DTW). For instructions on how to invoke the DTW, see "Invoke the Discovery Tool Wizard by doing the following."
    2. Add the duplicated IP addresses to the IP address exclude area of the DTW, as shown in the following figure:

    Figure 10. Rejected Duplicate Elements


    Elements Are Not Identified During Discovery

    If DataLoad did not receive an SNMP answer using the community name configured in the Tivoli Netcool Performance Manager profile, warning messages are created.

    About this task

    Warning messages like the following are written to the log and displayed on the Live Information Tab of the Inventory Tool:

    Warning: Unidentified Agents={192.168.1.201,192.168.1.100,192.168.1.84,192.168.1.75,192.168.1.103,192.168.1.93,192.168.1.98,192.168.1.92,192.168.1.60,192.168.64.103}
    Warning: Unidentified Agents={192.168.1.48,192.168.64.1,192.168.64.113,192.168.64.106,192.168.127.253,192.168.1.242,192.168.1.181,192.168.1.182,192.168.1.254,192.168.1.62}
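Because each warning can list many addresses, it may help to collect them into a single de-duplicated list before checking devices one by one. The following Python sketch assumes the warning layout shown above; the function name unidentified_agents is hypothetical.

```python
import re

# Captures the comma-separated list between the braces of each warning.
_AGENTS = re.compile(r"Warning: Unidentified Agents=\{([^}]*)\}")


def unidentified_agents(log_text):
    """Return the unidentified agent addresses in first-seen order, without repeats."""
    seen = []
    for group in _AGENTS.findall(log_text):
        for address in group.split(","):
            address = address.strip()
            if address and address not in seen:
                seen.append(address)
    return seen


log_text = (
    "Warning: Unidentified Agents={192.168.1.201,192.168.1.100,192.168.1.84}\n"
    "Warning: Unidentified Agents={192.168.1.48,192.168.1.100}\n"
)
agents = unidentified_agents(log_text)
```

Each address in the resulting list is a candidate for the qPing and SNMP agent checks described in the procedure for this problem.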

    Devices may fail to respond for reasons like the following:
    v The device is not reachable from the DataLoad collector (for example, no network route).
    v The SNMP agent was not started in the device.
    v Tivoli Netcool Performance Manager has the wrong SNMP community name for the device.
    v The device Access List is preventing Tivoli Netcool Performance Manager DataLoad from acting as an SNMP Manager for this device.
    v The firewall configuration is preventing SNMP traffic with the device.

    Figure 11. IP address exclude area of the DTW


    To troubleshoot this problem, follow these steps:

    Procedure
    1. Log in as pvuser (or the user name that you specified during installation) on the system where DataMart is installed.
    2. (Optional) Ensure that the Oracle database and Listener are running. For more information, see the Tivoli Netcool Performance Manager Installation Guide.
    3. Change your working directory to DATAMART_ROOT/bin, replacing DATAMART_ROOT with the root DataMart directory (/opt/datamart by default).
    4. Perform an Internet Control Message Protocol (ICMP) query on the device by entering the following command, replacing IP_ADDRESS_OF_UNIDENTIFIED_DEVICE with the IP address of the unidentified device, and NAME_OF_COLLECTOR_SERVER_DOING_PING with the name of the collector server doing the ping:
       qPing IP_ADDRESS_OF_UNIDENTIFIED_DEVICE -S NAME_OF_COLLECTOR_SERVER_DOING_PING
       If successful, the qPing command returns the IP address of the unidentified device, as follows:
       $ qPing 192.168.68.33 -S dev19
       192.168.68.33:10
    5. Connect to the device and verify whether the SNMP agent is enabled and running.
    6. Connect to the device and verify the community name.
    7. (Optional) You might have to change the community name in Tivoli Netcool Performance Manager, by using the DataMart->Inventory->Discovery Wizard to add an "alternate" community name to the profile. To change the community name, follow these steps:
       a. Invoke the Discovery Tool Wizard (DTW). For instructions on how to invoke the DTW, see "Invoke the Discovery Tool Wizard by doing the following."
       b. Add an alternate community name in the specified text box.
       c. Click the Add button to confirm your choice, as shown in the following figure:


    Elements Skipped Because No Related Sub-Elements

    When Tivoli Netcool Performance Manager discovers an element but cannot discover any related sub-elements, warning messages are written to the log and displayed on the Information Tab of the Inventory Tool.

    About this task

    The warning messages are like the following:

    Skipping 3 elements (Set (192.168.66.221_jeffs2 192.168.66.221_default 192.168.66.221_jeffs1)) in output file, because they dont have related subElements

    This is not expected behavior, and should be resolved immediately.
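To find out which devices are affected, the element names can be extracted from the warning and reduced to their addresses. This Python sketch assumes that element names take the ADDRESS_SUFFIX form seen in the sample message; both function names are hypothetical.

```python
import re

# Count and space-separated element names from a "Skipping N elements" warning.
_SKIPPED = re.compile(r"Skipping\s+(\d+)\s+elements\s+\(Set\s+\(([^)]*)\)\)")


def skipped_elements(message):
    """Return (count, [element names]) from a warning, or (0, []) if absent."""
    match = _SKIPPED.search(message)
    if match is None:
        return 0, []
    return int(match.group(1)), match.group(2).split()


def element_addresses(names):
    """Return the unique address prefixes, assuming names look like ADDRESS_SUFFIX."""
    addresses = []
    for name in names:
        address = name.split("_", 1)[0]
        if address not in addresses:
            addresses.append(address)
    return addresses


warning = ("Skipping 3 elements (Set (192.168.66.221_jeffs2 "
           "192.168.66.221_default 192.168.66.221_jeffs1)) in output file, "
           "because they dont have related subElements")
count, names = skipped_elements(warning)
addresses = element_addresses(names)
```

In the sample, all three skipped elements share the address 192.168.66.221, which is the value to search for in proviso.log.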

    To troubleshoot this problem, follow these steps:

    Procedure
    1. Log in as pvuser (or the user name that you specified during installation) on the system where DataChannel is installed.
    2. Change your working directory to $DC_HOME/log (/opt/datachannel/log by default) by entering the following command:
       cd $DC_HOME/log
    3. Enter the following command to search the proviso.log for the Discovery formula of the skipped element, replacing ELS_IP_AD with the IP address of the skipped element:
       grep DISC proviso.log | grep R00020 | grep ELS_IP_AD | grep CHKDISCOVERY
    4. If the grep command returns no output, do the following:

    Figure 12. Discovery Tool Wizard


       Check the content of the following files in the $PVMHOME/conf directory ($PVMHOME is defined as /opt/datamart by default), and then contact Micromuse support:
       v $PVMHOME/conf/inventory_elements.txt
       v $PVMHOME/conf/inventory_sub_elements.txt

    5. If the grep command returns output like the following:
       2005.03.15-18.58.28 UTC DISC.DEV19.1-13308 2 CHKDISCOVERY R00020/192.168.68.173. Family:Generic~Agent, try discoveryFormula Formula (Basic_Element - 7486 string)
       2005.03.15-18.58.43 UTC DISC.DEV19.1-13308 2 CHKDISCOVERY R00020/192.168.68.173. Family:1213_Device, try discoveryFormula Formula (1213_Device - 4851)
       2005.03.15-18.58.45 UTC DISC.DEV19.1-13308 2 CHKDISCOVERY R00020/192.168.68.173. Family:IETF_IF, try discoveryFormula Formula (IETF_IF - 9885)
       The following may be attempted:
       a. If you believe that the device should in fact respond, try to discover the device manually and monitor the trace, using the DataMart->Metric->Formula Editor, as shown in the following figure:
       b. Activate the relevant portion of the device MIB (for example SAA in the Cisco router).

    Figure 13. Formula Editor


       c. In some cases, SNMP attributes do not respond properly, and the Discovery formula for those devices must be changed. In this event, contact Micromuse support.
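The CHKDISCOVERY lines returned by the grep command in this procedure can also be summarized by a script, for example to list which families were tried for a given run and address. The Python sketch below is based only on the three sample lines shown in this guide, so the pattern is an assumption about the log layout rather than a documented format; the names DiscoveryAttempt, parse_attempt, and families_tried are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional


class DiscoveryAttempt(NamedTuple):
    run: str
    address: str
    family: str
    formula: str


# Run number, IPv4 address, family, and formula name of a CHKDISCOVERY entry.
_CHK = re.compile(
    r"CHKDISCOVERY\s+(R\d+)/(\d{1,3}(?:\.\d{1,3}){3})\.\s+"
    r"Family:([^,]+),.*?Formula\s+\(([^()]+?)\s+-\s+\d+"
)


def parse_attempt(line: str) -> Optional[DiscoveryAttempt]:
    """Parse one proviso.log line; return None if it is not a CHKDISCOVERY entry."""
    match = _CHK.search(line)
    if match is None:
        return None
    return DiscoveryAttempt(*match.groups())


def families_tried(lines, run: str, address: str) -> List[str]:
    """List the families tried for one run number and one address."""
    attempts = (parse_attempt(line) for line in lines)
    return [a.family for a in attempts
            if a is not None and a.run == run and a.address == address]


log_lines = [
    "2005.03.15-18.58.28 UTC DISC.DEV19.1-13308 2 CHKDISCOVERY R00020/192.168.68.173. "
    "Family:Generic~Agent, try discoveryFormula Formula (Basic_Element - 7486 string)",
    "2005.03.15-18.58.43 UTC DISC.DEV19.1-13308 2 CHKDISCOVERY R00020/192.168.68.173. "
    "Family:1213_Device, try discoveryFormula Formula (1213_Device - 4851)",
    "2005.03.15-18.58.45 UTC DISC.DEV19.1-13308 2 CHKDISCOVERY R00020/192.168.68.173. "
    "Family:IETF_IF, try discoveryFormula Formula (IETF_IF - 9885)",
]
families = families_tried(log_lines, "R00020", "192.168.68.173")
```

An empty result for a run and address corresponds to the case in which the grep command returns no output.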

    Discovery Seems to Hang or Never Finishes

    If the Discovery server shows no progress for several minutes, it has probably encountered problems with a Discovery formula.

    About this task

    If the Discovery server shows no progress for several minutes (up to thirty minutes in some cases), as shown in the following trace log, it has probably encountered problems with a Discovery formula:

    Problems, real or apparent, with Discovery formulas can be the result of the following:
    v The SNMP agent is slow to respond.
    v The network latency is very long.
    v The Discovery formula is not well-suited to the SNMP agent.
    v Collector performance issues.

    When Tivoli Netcool Performance Manager is in such a state, one of two things results:
    v The Discovery formula finally succeeds and the Inventory continues.
    v The Discovery hangs until the Inventory timeout occurs (two hours), when the following error message is written to the log file:
      Error: Profile is not progressing during the last 7201 seconds, aborting
      Contact Micromuse support and provide them with the DATACHANNEL_ROOT/datachannel/log/[yyyy.mm.dd]SNMP.log file.

    However, before contacting Micromuse support, and before the Discovery finishes or times out, do the following:

    Procedure
    1. Determine which Discovery formula is causing the problem by doing the following:

    Figure 14. Discovery formula


       a. Log in as pvuser (or the user name that you specified during installation) on the system where DataChannel is installed.
       b. Change your working directory to DATACHANNEL_ROOT/log, replacing DATACHANNEL_ROOT with the root DataChannel directory (/opt/datachannel by default).
       c. Enter the following command to search the proviso.log for the Discovery formula:
          grep DISC proviso.log | grep R00010 | tail -1
          The command returns output like the following:
          2005.03.15-20.32.16 UTC DISC.PVDEMO2.1-1048 2 CHKDISCOVERY R00010/172.31.0.51. Family:Cisco_CBQoS_Action, try discoveryFormula Formula (Cisco_CBQoS_Action - 7784)
          In this example, the Discovery formula is Cisco_CBQoS_Action.

    2. Determine the internal request ID that is stuck by doing the following:
       a. Log in as pvuser (or the user name that you specified during installation) on the system where the DataMart or DataLoad server is installed.
       b. Change your working directory to DATAMART_ROOT/log, replacing DATAMART_ROOT with the root DataMart directory (/opt/datamart by default).
       c. Enter the following command to find the internal request ID, replacing SERVER_NAME with the name of the Tivoli Netcool Performance Manager DataMart server:
          statGet -S SERVER_NAME | grep -i once
          The command returns output like the following:
          + [33] ID 96627,{CAL none (ONCE)(next=2005/03/15 20:32:16)}(P3)ACTIVE (LastExec):ServiceForm:(Trgt=(string)172.31.0.51)(Form=(form)Cisco_CBQoS_Action)(Inst=)(RComm=p
          + [40] ID 96683,{CAL none (ONCE)(next=2005/03/15 20:48:19)}(P1)ACTIVE (LastExec): ServiceSTAT (LONG)
          In this example, the internal request ID is 96627.
          Important: If the command returns no output, contact Micromuse support.

    3. Enable limited debugging on the collector server in order to populate the log with additional information by doing the following:
       a. Change your working directory to /opt/dataload/contribs.
       b. Enter the following command, replacing TASKID with the internal request ID you found in step 2:
          dialogTest2 Debug 6.TASKID
          Tivoli Netcool Performance Manager outputs a message like the following:
          Set debug level to 6 for taskId 96627.
          Debug configuration:
          > Global Level= 1; Mask=FW
          > ID 96627 Level= 6; Mask=FWI1234
          WARNING: DO NOT USE THE GLOBAL COLLECTOR DEBUGGER.
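Steps 2 and 3 of this procedure can be linked in a script: find the ONCE request whose target and formula match the stuck Discovery, then build the argument for dialogTest2. The following Python sketch assumes the statGet output layout shown in the sample; the function names find_request_id and debug_argument are hypothetical, and the script only builds the argument string, it does not run dialogTest2.

```python
import re
from typing import Optional

# Request ID, target address, and formula name of a ONCE request entry.
_ENTRY = re.compile(
    r"ID\s+(\d+),.*?\(ONCE\).*?Trgt=\(string\)([^)]+)\).*?Form=\(form\)([^)]+)\)"
)


def find_request_id(statget_output: str, target: str, formula: str) -> Optional[int]:
    """Return the internal request ID for the given target and formula, if present."""
    for line in statget_output.splitlines():
        match = _ENTRY.search(line)
        if match and match.group(2) == target and match.group(3) == formula:
            return int(match.group(1))
    return None


def debug_argument(task_id: int, level: int = 6) -> str:
    """Build the LEVEL.TASKID argument used with 'dialogTest2 Debug'."""
    return f"{level}.{task_id}"


statget_output = (
    "+ [33] ID 96627,{CAL none (ONCE)(next=2005/03/15 20:32:16)}(P3)ACTIVE "
    "(LastExec):ServiceForm:(Trgt=(string)172.31.0.51)"
    "(Form=(form)Cisco_CBQoS_Action)(Inst=)(RComm=p\n"
    "+ [40] ID 96683,{CAL none (ONCE)(next=2005/03/15 20:48:19)}(P1)ACTIVE "
    "(LastExec): ServiceSTAT (LONG)\n"
)
task_id = find_request_id(statget_output, "172.31.0.51", "Cisco_CBQoS_Action")
```

With the sample output, task_id is 96627 and debug_argument(task_id) gives 6.96627, which limits debugging to that one request rather than the global collector debugger.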


    Synchronization Troubleshooting

    The following sections address the more common problems that arise during Synchronization.

    Synchronization (Elements)

    During the synchronization phase, Tivoli Netcool Performance Manager compares the virtual network image generated by the Discovery with the records in the Tivoli Netcool Performance Manager database that were created by the previous Inventory run.

    Any modifications (new, missing, or renamed resources, for example) are then synchronized through the application of various algorithms, and the new network image is written to the database.

    In order to track elements and sub-elements through subsequent inventories, Tivoli Netcool Performance Manager identifies them with a unique, never-changing identifier, called an invariant.

    The default element invariant, for example, is a concatenation of the following three attributes:
    v The MIB II sysName
    v The first IP address responding to the ICMP scan (in Inventory mode 1) or the one given in the mode 2 Inventory file
    v The first valid physAddress (MAC address) in the MIB II ifTable

    The following table, "Synchronization Invariant Logic," illustrates the logic used to determine if a "newly-discovered" element is actually new or is an existing element that has moved or changed.

    Synchronization Invariant Logic

    Element Attributes in Database
    MIB II sysName   IP Address   physAddress   New Element If ...
    Defined          Defined      Defined       Two or three attributes have changed.
    Empty            Defined      Defined       Two attributes have changed.
    Empty            Empty        Empty         The resolved name is different from the resolved name in the database.
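The table can be read as a small decision function. The Python sketch below is one interpretation of the three rows, not the product's actual synchronization algorithm; the class and function names are hypothetical, and any combination of defined and empty attributes that the table does not list is returned as None.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ElementAttributes:
    sys_name: Optional[str]
    ip_address: Optional[str]
    phys_address: Optional[str]
    resolved_name: Optional[str] = None


def is_new_element(stored: ElementAttributes,
                   discovered: ElementAttributes) -> Optional[bool]:
    """Apply the table rows; `stored` is the database record."""
    pairs = [
        (stored.sys_name, discovered.sys_name),
        (stored.ip_address, discovered.ip_address),
        (stored.phys_address, discovered.phys_address),
    ]
    defined = [bool(old) for old, _ in pairs]
    # Count only attributes that are defined in the database and now differ.
    changed = sum(1 for old, new in pairs if old and old != new)

    if defined == [True, True, True]:
        return changed >= 2          # row 1: two or three attributes changed
    if defined == [False, True, True]:
        return changed == 2          # row 2: both defined attributes changed
    if defined == [False, False, False]:
        return stored.resolved_name != discovered.resolved_name  # row 3
    return None                      # combination not covered by the table


stored = ElementAttributes("rtr1", "10.0.0.1", "00:11:22:33:44:55")
moved = ElementAttributes("rtr1", "10.0.0.9", "00:11:22:33:44:55")
replaced = ElementAttributes("rtr2", "10.0.0.1", "66:77:88:99:AA:BB")
```

Here is_new_element(stored, moved) is False because only the IP address changed, while is_new_element(stored, replaced) is True because two attributes changed. This is consistent with the recommendation to change no more than one attribute between two Inventory runs.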

    If an element does not respond to one of these three MIB II attributes, contact Micromuse support.

    To avoid synchronization errors, we strongly recommend that you limit device configuration changes to one attribute between two runs of an Inventory (Discovery and Synchronization).


    Pre-synchronization Summary

    As Tivoli Netcool Performance Manager prepares for synchronization, it compiles a list of elements.

    The list contains the following types of elements:
    v New
    v Updated
    v Burned
    v Unchanged
    v Not Found
    v Histo Reject

    Figure 15, "Pre-synchronization Summary," shows the display written to the DataMart->Inventory Tool->Live Information tab, with this summary information highlighted.

    The following sections discuss this summary in more detail.

    Identifying not found elements

    If the Pre-synchronization summary lists any not found elements, contact Micromuse support.

    Figure 15. Pre-synchronization Summary


    An example of not found elements can be found in Figure 15, "Pre-synchronization Summary," which contains three examples.

    Identifying burned elements

    How to identify burned elements.

    In cases where the MIB II sysName, IP address, and physAddress are all defined for an element (the first row in the "Synchronization Invariant Logic" table), Tivoli Netcool Performance Manager may encounter a special situation if the MIB II sysName and physAddress have changed, but the IP address remains unchanged.

    In this special case, Tivoli Netcool Performance Manager creates a new element and writes it to the database. However, since the IP address for "both" elements is the same, so too is the resolved name, which means that the elements would be seen by Tivoli Netcool Performance Manager as the same element.

    To prevent this situation, Tivoli Netcool Performance Manager renames the initial element with a "Burned" prefix, as shown in Figure 16, "Burned Element."

    When Synchronization completes, a list of burned elements is written to the following file:

    PROFILE_HOME/PROFILE_NAME/synchro/e_burned.dat

    e_burned.dat file

    If you have an e_burned.dat file, contact Micromuse support.

    The burned elements and their sub-elements are duplicated, and the statistics attached to these resources are not continuous in reports.
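The condition that leads to a burned element, and the location of the resulting list, can be expressed compactly. This Python sketch restates the special case described above (sysName and physAddress changed, IP address unchanged, all three defined in the database); the names Attributes, is_burn_case, and burned_list_path are hypothetical, and the sketch does not reproduce how the product renames elements.

```python
from pathlib import Path
from typing import NamedTuple


class Attributes(NamedTuple):
    sys_name: str
    ip_address: str
    phys_address: str


def is_burn_case(stored: Attributes, discovered: Attributes) -> bool:
    """True when sysName and physAddress changed but the IP address did not."""
    return (
        all(stored)  # all three attributes are defined in the database
        and stored.sys_name != discovered.sys_name
        and stored.phys_address != discovered.phys_address
        and stored.ip_address == discovered.ip_address
    )


def burned_list_path(profile_home: str, profile_name: str) -> Path:
    """Build PROFILE_HOME/PROFILE_NAME/synchro/e_burned.dat."""
    return Path(profile_home) / profile_name / "synchro" / "e_burned.dat"


before = Attributes("rtr1", "10.0.0.1", "00:11:22:33:44:55")
after = Attributes("rtr2", "10.0.0.1", "66:77:88:99:AA:BB")
burn = is_burn_case(before, after)
```

Checking burned_list_path(...).is_file() after a Synchronization is one way to notice that burned elements were written.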

    Preventing Burned Elements

    Tivoli Netcool Performance Manager relies quite heavily on the sysName when calculating invariants. When using sysName you need to prevent burned elements.

    Figure 16. Burned Element


    About this task

    To prevent burned elements, do the following:

    Procedure
    1. Do NOT change the sysName on a device when the lowest physAddress has also changed.
    2. Ensure that the sysName is defined and not left empty.
    3. Ensure that the sysName is defined with a unique value.
    4. Minimize any changes to the sysName once they have been set, unless you want Tivoli Netcool Performance Manager to see the device as new and interrupt the continuity of statistics.
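Rules 2 and 3 above lend themselves to an automated check against a device list that you maintain yourself. The Python sketch below assumes a hypothetical mapping of device address to sysName gathered by other means; the function name sysname_problems is likewise hypothetical.

```python
from collections import Counter


def sysname_problems(sysnames_by_address):
    """Return (address, problem) pairs for empty or non-unique sysName values."""
    problems = []
    counts = Counter(name for name in sysnames_by_address.values() if name)
    for address, name in sorted(sysnames_by_address.items()):
        if not name:
            problems.append((address, "sysName is empty"))
        elif counts[name] > 1:
            problems.append((address, f"sysName '{name}' is not unique"))
    return problems


# Hypothetical device list: address -> sysName as configured on the device.
devices = {
    "10.0.0.1": "core-1",
    "10.0.0.2": "",
    "10.0.0.3": "edge",
    "10.0.0.4": "edge",
}
problems = sysname_problems(devices)
```

In this example, 10.0.0.2 is flagged for an empty sysName and 10.0.0.3 and 10.0.0.4 are flagged because they share the same value.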

    Detecting Too Many New Elements

    If there have been extensive configuration changes before an Inventory run (for example, if two attributes were changed simultaneously on a router), Tivoli Netcool Performance Manager will discover and create a new element in the database.

    About this task

    The consequences are severe:
    v The element and its sub-elements are duplicated.
    v The statist