OPC_TSG OC3_12_48

OPC Troubleshooting Guide


  • OPC Troubleshooting Guide for

    OC-3/OC-12 TBM Rel 13
    OC-48 Rel 14.10

    Security Notice: The information disclosed herein is property of Nortel or others and is not to be used by or disclosed to unauthorized persons without the written consent of Nortel. The recipient of this document shall respect the security status of the information.

    FIBER WORLD

    Editor: Ross Brydon
    Date: Aug. 11, 1998
    Issue: AD05

  • OPC Trouble Shooting Guide (OC-48 Rel. 14.1, OC-3/OC-12 Rel. 13) Date: Aug. 11, 1998 Issue: AD04 Editor: Ross Brydon

    For Nortel Internal Use Only

    Issue / Change History

    Issue  Date              Reason                                        Change     Author
    AA01   July 18, 1997     Created the PLS book version of the internal  wpit.24    Wayne Pitman
                             trouble shooting guide - opctsg_i.aa01
    AB01   July 25, 1997     Added problem text to SRT section             heald.1    Alex Balaban
    AB02   July 29, 1997     Updated SRT section.                          heald.3    Vladimir Milutinovic
    AB03   August 5, 1997    Added information to the CUA section.         heald.5    Greg Lamarre
    AB04   August 7, 1997    Added information to the LMR section.         olga.8     Olga Beskid-Wojcicka
    AB05   August 11, 1997   Editing changes.                              heald.6    Colette Heald
    AB06   Sept. 12, 1997    Added the stpprov to Chapter 16               CJT.21     Chris Todd
    AB07   Sept. 17, 1997    Included TL1 and rmopcld sections. Closed     esami.23   Eric Sami
                             all final actions and incorporated the final
                             comments.
    AB08   Sept. 28, 1997    Update to the section Atools and 06brm        esami.24   Eric Sami
    AB09   Nov. 11, 1997     Included Chapter 18 and some other final      rbrydon.1  Ross Brydon
                             changes to the document
    AB10   Nov. 25, 1997     Approved version of this document.            rbrydon.2  Ross Brydon
    AC01   May 26, 1998      Add CUA changes                               CMRAGHU.8  Raghunath Mohanrao
    AC02   June 22, 1998     Add connection services updates               BROOMH.1   Hugh Broomfield
    AC03   July 14, 1998     Add ESWD to TL1 section                       VEENA.7    Madhuri Veena
    AC04   July 14, 1998     Add ESWD to TL1 section                       VEENA.8    Madhuri Veena
    AC05   July 14, 1998     Add ESWD to TL1 section                       VEENA.9    Madhuri Veena
    AC06   July 20, 1998     Add TL1 over 7 layer problems                 SHAKTI.1   Shakti Thakur
    AC07   July 21, 1998     Updating RHE document from changes in         VEENA.10   Madhuri Veena
                             external review.
    AC08   July 23, 1998     Final formatting revisions for Rel 14.10 OPC  RBRYDON.4  Ross Brydon
                             TSG.
    AD01   July 28, 1998     New stream for OC12 Rel 13 Troubleshooting    RBRYDON.5  Ross Brydon
                             Guide.
    AD02   August 6, 1998    OPC Support for InService Timeslot Rollover   GREGG.95   Greg Gnaedinger
                             on OC12
    AD03   August 6, 1998    OC12 Rel 13 matched node provisioning.        CXINP.13   Cindy Pu
    AD04   August 11, 1998   Final formatting revisions for OC12 Rel 13    RBRYDON.6  Ross Brydon
                             TSG and include PSU section from Jim Forbes.
    AD05   August 28, 1998   Multiple updates                              WPIT.28    Wayne Pitman

    TSG Versions and the corresponding stream and OPC release.

    Stream                   OC-48 / OC-12 Release   TSG Version
    OPC 15, 16               Rel 10                  10.0
    OPC 17, 18, 20, 22, 23   Rel 11                  11.0
    OPC 24                   Rel 13                  13.0
    OPC30-36                 Rel 14                  14.0
    OPC38                    Rel 14.10               14.10
    OPC45                    Rel 13.00               14.10 / 13.00

    Document Editor:
    Ross Brydon 1B43

    Document Location:
    This document is stored in pls fwpdoc under the document name opctsg_i. Uncontrolled copies, and copies of this document for previous releases, can be found on local file servers under /opcserv/common/operations/dept_docs/OPC_tsg.

    Document Approval:
    This is an internal document updated by the designers, and it follows the informal process; therefore no approval is necessary. The updated sections are cua, tl1, stp, and Atools.

    Purpose of Document
    The OPC Troubleshooting Guide is intended as a means of assistance with solving common technical problems which may arise during the operation of the OPC. It also lists potential defect PRSs which are pertinent to the particular release. Each chapter of the document is written by a relevant OPC designer and deals with specific aspects of OPC operation. Each chapter is updated, if necessary, with each OPC software release. THIS DOCUMENT IS NOT INTENDED FOR CUSTOMER USE.

    Summary
    The OPC Troubleshooting Guide represents the amalgamation of common problems and work-arounds encountered through the course of testing and working with OPC36, OPC38, and OPC45. This guide is not meant to be an exhaustive representation of all problems but only those which have occurred most frequently during OPC operation and use.

    All problems outlined in this document are accompanied by one or more fault reasons and solutions. Generally the solutions are embedded in the reason text, but if a common work-around is available then references to that work-around will be used instead. Throughout the document, there are solutions which say "Contact the appropriate OPC support authority"; in these cases gather as much data as possible and contact the OPC Customer Support Staff.

    Target Audience
    The OPC Troubleshooting Guide is meant for general use by the TransportNode Product Support Teams. It is assumed that the reader is familiar with the basic operation of the OPC, the UNIX environment and the OPC designer test tools. THIS DOCUMENT IS NOT INTENDED FOR CUSTOMER USE.

    Document Highlights
    This document contains:


    Standard TOC - lists all problems and solutions, ordered as they are presented in the chapters.

    The chapters describe common OPC problems. Each problem will be described and possible diagnostic reasons will be provided. Available work-arounds will be provided at the end of each problem reason description.

    All Problem and Solution Titles are formatted in the following manner:

    <paragraph number> <subject>: <problem/solution>

    where:

    paragraph number = unique numerical identifier of all problems and solutions in this document

    subject = basic category of the problem or solution; valid subjects are:

    BRM        CAM   CNET    CUA    DISK     DLM
    Ethernet   GUI   H/W     LAPB   LAS      MC68302
    MIB        NEA   NUM     OAM    ODS      OPC
    OSI        OWS   ROA     SCF    SCM      SRT
    STP        SWI   TAPE    TBOS   TELNET   TL1
    USM        VCP   VT100   X.25   X.3      XNTP

    problem/solution = a brief single sentence description of the problem or solution
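    The title convention above can be sketched as a small parser. This is only an illustration, assuming the layout "paragraph number, subject, colon, description" that the problem titles in this guide follow; the names Title, parse_title and has_valid_subject are hypothetical and are not part of any OPC tool.

```python
import re
from typing import NamedTuple, Optional

# Subject codes copied from the "valid subjects" list in this guide.
VALID_SUBJECTS = {
    "BRM", "CAM", "CNET", "CUA", "DISK", "DLM",
    "Ethernet", "GUI", "H/W", "LAPB", "LAS", "MC68302",
    "MIB", "NEA", "NUM", "OAM", "ODS", "OPC",
    "OSI", "OWS", "ROA", "SCF", "SCM", "SRT",
    "STP", "SWI", "TAPE", "TBOS", "TELNET", "TL1",
    "USM", "VCP", "VT100", "X.25", "X.3", "XNTP",
}


class Title(NamedTuple):
    number: str   # e.g. "1.1.1"
    subject: str  # e.g. "VT100"
    summary: str  # e.g. "Login Prompt Missing"


# Assumed layout: "<paragraph number> <subject>: <problem/solution>".
# The subject is everything up to the first colon.
_TITLE_RE = re.compile(r"^(\d+(?:\.\d+)*)\s+([^:]+?)\s*:\s*(.+)$")


def parse_title(line: str) -> Optional[Title]:
    """Split a title line into its three fields, or return None if it does not fit."""
    match = _TITLE_RE.match(line.strip())
    if match is None:
        return None
    number, subject, summary = match.groups()
    return Title(number, subject, summary)


def has_valid_subject(title: Title) -> bool:
    """Case-insensitive check of the subject against the list above."""
    return title.subject.upper() in {s.upper() for s in VALID_SUBJECTS}
```

    A title without a colon, such as some entries in the PM Collection chapter, returns None, and a few titles in the guide use subjects that are not in the list (for example BAD DISK), so has_valid_subject is a consistency check rather than a guarantee.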


    1 - OPC Login and Start Up
    1.1.1 VT100: Login Prompt Missing
    1.1.2 VT100: Login Won't Respond
    1.1.3 VT100: Removing Port B Doesn't Automatically Logout.
    1.1.4 VT100: Password Automatically Rejected
    1.1.5 USM: Cannot Login
    1.1.6 USM: Root Password Not Valid
    1.1.7 USM: opcui: command not found
    1.1.8 USM: UI is Garbled
    1.1.9 USM: UI Won't Start
    1.1.10 USM: Critical System Resource Unavailable
    1.1.11 USM: Excessively Slow
    1.1.12 USM: Keeps Closing
    1.1.13 USM: Tools Won't Open
    1.1.14 USM: Tools Unavailable
    1.1.15 USM: There are no toolsets defined for this user
    1.1.16 USM: Failed to retrieve user profile
    1.1.17 Telnet: Telnet Connection not running
    1.1.18 GUI: Login Window Not Available
    1.1.19 GUI: Text doesn't fit in Window
    1.2.1 VT100: Reconfiguring Port B to Terminal
    1.2.2 VT100: Port B Cable Pinouts
    1.2.3 VT100: Port B settings
    1.2.4 USM: Using the wall and write commands
    1.2.5 GUI: Setting Up An Xterminal for an OPC
    1.2.6 EtherNet: OPC EtherNet Connector Pinout

    2 - OPC Base Operations
    2.1.1 OAM: NE Indicates an OPC OAM S/W Failure
    2.1.2 OWS: Both Primary and Backup OPCs are Active
    2.1.3 OWS: Primary OPC is Inactive
    2.1.4 OWS: Backup OPC Won't Go Active
    2.1.5 ODS: Data Synchronization Fails
    2.1.6 ODS: Want to Data Sync from Backup to Primary
    2.1.7 CAM: Associations are Down or are Unstable
    2.1.8 OPC: OPC is Not Communicating
    2.1.9 OPC: OPCCLEAN is Not Running
    2.1.10 OPC: OPC is Continuously Rebooting
    2.1.11 OWS: OWS_SWACT Doesn't Work
    2.2.1 OPC: Booting from Tape
    2.2.2 OSI: Reconstructing an OPC's Serial Number

    3 - OPC Hardware
    3.1.1 H/W: ELAN Fail is Lit
    3.1.2 H/W: CNET Fail
    3.1.3 H/W: Active is not lit
    3.1.4 H/W: Unit Fail is lit
    3.1.5 TAPE: Amber Light is on
    3.1.6 TAPE: Amber Light is Flashing Rapidly
    3.1.7 TAPE: Green Light is on
    3.1.8 TAPE: Green Light is Flashing Slowly
    3.1.9 TAPE: Green Light is Flashing Slowly and the Amber Light is on
    3.1.10 TAPE: Green Light is Flashing Rapidly
    3.1.11 TAPE: Tape Won't Eject
    3.1.12 TAPE: RBNCLEAN is Not Running
    3.1.13 TAPE: Tape Drive Cleaning Alarm
    3.1.14 BAD DISK: SCANDISK/KLS is Not Running
    3.1.15 BAD DISK: Disk Bad Media Alarm
    3.1.16 HARDRIVE INDICATOR LIGHT: flashing
    3.2.1 CNET: Using tstatc to evaluate CNET

    4 - NE Software Download
    4.1.1 DLM: Reboot/Load Manager is not Downloading an NE
    4.1.2 DLM: Reboot/Load Manager Won't Start
    4.1.3 DLM: Reboot/Load Manager is not displaying any NEs
    4.1.4 DLM: NE is Continuously Rebooting
    4.1.5 DLM: NE is Frozen Immediately After a Reboot
    4.1.6 DLM: Reboot/Load Manager indicates Fail
    4.1.7 DLM: Load Processor Activity is Disabled
    4.1.8 DLM: NE shelf processor firmware load is Corrupt or Incomplete
    4.1.9 DLM: NE shelf processor application load is Corrupt or Incomplete
    4.2.1 DLM: NE serial number is corrupted
    4.2.2 DLM: Removing Release From the Backup OPC

    5 - Installations & Upgrades
    5.1.1 SWI: Backup OPC is still active after start_backout completes
    5.1.2 SWI: OPC Not Functioning After Installation
    5.1.3 SWI: Installation is Failing During Validation
    5.1.4 SWI: Installation is Failing During Transfer
    5.1.5 NUM: Both Primary and Backup OPCs are Active
    5.1.6 RBN: Disk 95% Full Alarm

    6 - Backup/Restore Manager
    6.1.1 BRM: NE Backup is Not Working
    6.1.2 BRM: NE Restore is Not Working
    6.1.3 BRM: Backup OPC is Handling the NE Database Requests
    6.1.4 BRM: NE Database Backups Incomplete or Corrupted

    7 - OPC Save And Restore
    7.1.1 SRT: Save to Tape is Failing
    7.1.2 SRT: Restore from Tape is Failing
    7.1.3 SRT: Critical Files are Missing After a Restore
    7.2.1 SRT: Files saved by Save and Restore Tool
    7.2.2 SRT: Unable to Restore OPC Data from Disk

    8 - Commissioning Manager
    8.1.1 SCF: Can't Enable Clear Commissioning Button
    8.1.2 SCF: Can't Commission a New NE
    8.1.3 SCF: Error Message - This OPC contains invalid data
    8.1.4 ODS: Data Synchronization is Failing
    8.1.5 SCF: NE Release is Set to NONE
    8.1.6 SCF: Cannot Edit the Commissioned NE
    8.2.1 SCF: Replacing A Backup OPC
    8.2.2 SCF: Replacing A Primary OPC
    8.2.3 SCF: Clearing Commissioning Data
    8.2.4 SCF: Dumping All Commissioning Data to File

    9 - Network Surveillance
    9.1.1 LAS: Network Surv Tools Display ? Symbol
    9.1.2 LAS: OPC Alarm View Doesn't Match NE Alarm View
    9.1.3 LAS: CMT Status Line Doesn't Match the NE Alarm Banner
    9.1.4 LAS: Network Surv Tools Don't Display Newly Added NEs or NE Names
    9.1.5 LAS: Event Browser Filter Settings Changed
    9.1.6 LAS: Alarm Monitor Isn't Displaying All Alarms
    9.1.7 RBN: RBNDISK is Not Running
    9.1.8 RBN: Disk 95% Full Alarm
    9.2.1 LAS: Dumping Contents of the LAS Database

    10 - NE Login
    10.1.1 NEA: NEs are missing
    10.1.2 NEA: More NEs Displayed Than Commissioned
    10.1.3 NEA: Duplicate NEs Error Message
    10.1.4 NEA: NE Access is very slow
    10.1.5 NEA: NE Access is not available
    10.1.6 NEA: NE Login Manager will not Start
    10.1.7 NEA: Cannot Auto-Login to NE from OPC
    10.2.1 NEA: Using nelogin to Access NEs

    11 - Remote Telemetry
    11.1.1 TBOS: No NE Listed in Remote Telemetry Tool
    11.1.2 TBOS: NEs Missing on Remote Telemetry Tool
    11.1.3 TBOS: TBOS Display Screen IS Frozen
    11.1.4 TBOS: Serial Telemetry for Remote Display is Incorrect
    11.1.5 TBOS: Parallel Telemetry for Remote Display is Incorrect
    11.1.6 TBOS: Data Selector for Monitor Display is Disabled
    11.1.7 TBOS: Remote Telemetry Tool Won't Open on Active OPC
    11.1.8 TBOS: Source NE and Source Display Unknown
    11.1.9 TBOS: Maximum Number of Display Mappings Reached
    11.1.10 TBOS: Position ID Field is Invalid
    11.1.11 TBOS: Monitored Source Field is Invalid
    11.1.12 TBOS: Monitored Source Name Doesn't Correspond to Source ID
    11.1.13 TBOS: Display Field is Empty or Invalid
    11.1.14 TBOS: Cannot Remove Display Mapping
    11.1.15 TBOS: System Generated Time Out
    11.1.16 TBOS: Association is Down Between OPC and NE
    11.1.17 TBOS: Display is Already Mapped
    11.1.18 TBOS: Maximum Number of Mappings Exceeded
    11.1.19 TBOS: Display Mapped to its Source NE is Not Allowed

    12 - Remote OPC Login
    12.1.1 ROA: Remote OPC Login is Not Available
    12.1.2 ROA: OPCs are missing
    12.1.3 ROA: More OPCs Displayed Than Commissioned
    12.1.4 ROA: Duplicate OPCs Error Message
    12.1.5 ROA: OPC Access is very slow
    12.1.6 ROA: Remote OPC Login to other OPCs Not Available
    12.2.1 ROA: Using nelogin to Access OPCs

    13 - Centralized Security
    13.1.1 CUA: Users Cannot Login after Upgrade
    13.1.2 CUA: Users Cannot Login
    13.1.3 CUA: NE User Class Different Than Indicated
    13.1.4 CUA: Userid is Disabled, But User Can Still Login
    13.1.5 CUA: Userid is Disabled, But User Gets Wrong Error Message
    13.1.6 CUA: Cannot Login to Newly Commissioned NEs
    13.1.7 CUA: Cannot Login to NEs Which Have Been Restarted or Rebooted
    13.1.8 CUA: Userid shows Assigned/Expired, But the Password Was Changed
    13.1.9 CUA: Root Password Was Forgotten
    13.2.1 CUA: Verifying the Contents of the Password File
    13.2.2 CUA: Verifying the Contents of the Group File
    13.2.3 CUA: Recovering the Root Password

    14 - TL-1 and X.25
    14.1.1 TL1: TID/SID Name Rejected
    14.1.2 TL1: RETRIEVE PM Responds with DENY
    14.1.3 TL1: Response Messages are Lost or Not Complete
    14.1.4 TL1: Missing PM Counts
    14.1.5 TL1: Call Accepted Logs in the Event Browser
    14.1.6 TL1: Call Terminated Logs in the Event Browser
    14.1.7 TL1: Call Rejected Logs in the Event Browser
    14.1.8 TL1: Port not configured for X.25 Logs in the Event Browser
    14.1.9 TL1: Cannot Establish Connection
    14.1.10 TL1: Commands Aren't Being Performed
    14.1.11 TL1: Autonomous Messaging Isn't Working
    14.1.12 LAPB: LAPB is Dropping
    14.1.13 LAPB: LAPB Problems
    14.1.14 MC68302: MC68302 Problems
    14.1.15 X.3PAD: X.3 PAD Can't Establish Connection
    14.2.1 TL1: Setting Up X.25, VCP and X.3PAD
    14.2.2 TL1: Determining Inhibited PM Counts
    14.2.3 X.25: Default Settings Stored in X25INIT_TEMPLATE File
    14.2.4 X.3: Default Settings Stored in x3config File
    14.2.5 VCP: VCP PID Defaults
    14.2.6 TL1 Interface Router Service: Cannot Establish Connection
    14.2.7 TL1 configuration for TL1 Over TCP/IP: Error during Configuring and deleting the configuration
    14.2.8 STA-ESWD is rejected and ESWD cannot be started
    14.2.9 STA-ESWD accepted, ESWD initiated and then aborted
    14.2.10 CANC-ESWD rejected
    14.2.11 Association cannot be established over 7 Layers
    14.2.12 Association Established through ACT-USER and then dropped
    14.2.13 Association not Dropped by CANC-USER

    15 - Configuration Manager
    15.1.1 SCM: Cannot Save Configuration Data to NEs
    15.1.2 SCM: Cannot Send Configuration Data to NE
    15.1.3 SCM: Cannot Remove a Configuration
    15.1.4 SCM: Configuration Manager Doesn't Start
    15.1.5 SCM: Scheduled Configuration Audit Fails
    15.1.6 SCM: Scheduled Configuration Audit Mismatch
    15.2.1 SCM: Retrieving Configuration and Connection Data

    16 - Connection Manager
    16.1.1 STP: Cannot Send Connection Data to NEs
    16.1.2 STP: Connection Manager Doesn't Start
    16.1.3 STP: Unable to provision connections on an active Primary OPC
    16.1.4 STP: Unable to provision connections on an active Backup OPC
    16.1.5 STP: Connection Audit Fails
    16.1.6 STP: Audit Mismatch
    16.1.7 STP: No option to correct an audit mismatch
    16.1.8 STP: Cannot Add a Matched Node Connection
    16.1.9 STP: Unable to add a nodal cross-connect
    16.1.10 STP: Unable to delete connections
    16.1.11 STP: Unable to provision a DCP connection
    16.1.12 STP: Bandwidth unavailable
    16.1.13 STP: Uni-directional Nodal Cross-Connects present on TBM OC-12
    16.2.1 STP: Retrieving Configuration and Connection Data
    16.2.2 STP: Viewing the Protection State of a Matched Node Ring
    16.2.3 STP: Viewing Mismatched Connection Data
    16.2.4 STP: Correcting individual mismatches

    17 - PM Collection
    17.1.1 PM Counts not reported in TL1
    17.1.2 TL1 RTRV-PM command retrieves no data
    17.1.3 Daily counts on OPC do not match daily counts on the NE.
    17.1.4 PM Collection exceeded 15 minutes

    18 - Network Time Protocol
    18.1.1 XNTP: The OPC has entered freerun mode

    19 - Protection Manager / 1:N
    19.1.1 PSU: Can't Display Configuration
    19.1.2 PSU: Dumping All Protection Data to File

    20 - OPC Date / OPC Shutdown
    20.1.1 SSD: Time Zone Missing from Date UI

    A - Complete set of OPC Tools


    1 OPC Login and Start Up

    1.1 Problem Description

    The following sections describe common login and user interface problems. Each problem will be described and possible diagnostic reasons will be provided. Available work-arounds will be provided at the end of each problem reason description.

    1.1.1 VT100: Login Prompt Missing

    After an OPC shutdown, the VT100 connected to Port B shows all the proper OPC diagnostics but the login prompt is not available.

    Reason-1
    After an OPC shutdown, Port B always starts as a terminal so that it can display the OPC diagnostic results. After the diagnostics, Port B is initialized to the value indicated by the PortConfiguration tool. Use this tool to query the Port B configuration.

    See VT100: Reconfiguring Port B to Terminal on page 1-11.

    Reason-2
    Using the diskinit tape to boot the OPC causes Port B to initialize as a terminal with full modem line control. For this reason, a full RS-232 Null Modem cable is required.

    See VT100: Port B Cable Pinouts on page 1-12.

    1.1.2 VT100: Login Won't Respond

    During normal operation, the Port B connection suddenly stops working.

    Reason-1
    Port B has been changed to something other than terminal. Use the PortConfiguration tool to query the Port B configuration.

    See VT100: Reconfiguring Port B to Terminal on page 1-11.

    Reason-2VT100 settings changed. The OPC Port B cannot auto-baud down lower than 1200baud.

    See VT100: Port B settings on page 1-12.

    Reason-3
    VT100 settings changed. The OPC Port B cannot auto-baud up from a lower baud rate. Remove and reconnect the Port B cable to drop the terminal connection. If removing the cable does not correct the problem, reconfigure Port B.

    See VT100: Reconfiguring Port B to Terminal on page 1-11.

    Reason-4
    Port B cable is broken. Test the Port B cable.

    See VT100: Port B Cable Pinouts on page 1-12.

    Reason-5
    Port B is frozen. This is extremely rare and can be fixed by unconfiguring Port B and reconfiguring it as a terminal.

    1.1.3 VT100: Removing Port B Doesn't Automatically Logout

    Removing the Port B cable does not log out the session.

    Reason-1
    The cable was disconnected at the terminal end. Always disconnect the cable from the OPC end. The OPC detects cable removal by an open circuit on the DTR. If the cable has an internal loopback, the OPC cannot tell that the cable is missing.

    Reason-2
    Port B is frozen. This is extremely rare and can be fixed by unconfiguring Port B and reconfiguring it as a terminal.

    1.1.4 VT100: Password Automatically Rejected

    When attempting to log in, the userid is accepted and the password is rejected before any characters are entered.

    Reason-1
    VT100 settings changed. The new line option on the VT100 is enabled. Disable the new line option.

    See VT100: Port B settings on page 1-12.

    1.1.5 USM: Cannot Login

    When logging into the OPC, the userid and password are rejected.

    Reason-1
    Userid does not exist or has been disabled. This can occur if the Centralized User Administration tool is in use by another user or if a restore from tape has been performed. Log in to the OPC as admin and use the Centralized User Administration tool to add or enable the userid.

    See SRT: Files saved by Save and Restore Tool on page 7-3.
    See USM: Cannot Login on page 1-2.
    See USM: Root Password Not Valid on page 1-3.

    Reason-2
    Password is not correct. This can occur if the Centralized User Administration or Password tools are in use by another user or if a restore from tape has been performed. Use the old user password, or log in to the OPC as admin and use the Centralized User Administration tool to change the password.

    See SRT: Files saved by Save and Restore Tool on page 7-3.
    See USM: Cannot Login on page 1-2.
    See USM: Root Password Not Valid on page 1-3.

    Reason-3
    Too many users are already logged in or too many processes (tools) are running. Log in to the OPC as root and enter the who command to see who is logged into the OPC. The wall or write command can be used to send messages to other users asking them to log off or close tools.

    See USM: Using the wall and write commands on page 1-13.

    1.1.6 USM: Root Password Not Valid

    When logging into the OPC as root, the password is rejected.

    Reason-1
    Password is not correct. This can occur if a restore from tape has been performed. Use the old root password.

    See SRT: Files saved by Save and Restore Tool on page 7-3.

    Reason-2
    Password is not correct. This can occur if the Password tool is in use by another user or if the UNIX passwd command is used. Ask the network administrator for the new password.

    Reason-3
    THIS IS NOT INTENDED FOR THE FIELD APPLICATION

    The valid password has been forgotten. Use the diskinit tape to boot from tape, mount the hard drive and edit the password file directly.
    The passwd file is critical to the operation of the OPC; inadvertent changes may result in OPC corruption.

    1.1.7 USM: opcui: command not found

    As a root user, the error message opcui: command not found is usually displayed when executing opcui at an OPC prompt.

    Reason-1
    The opcui command cannot be performed from within a UNIX shell already running under a user session (CMT and GUI).

    Reason-2
    The opcui command cannot be performed when the OPC is in the process of being upgraded.

    Reason-3
    The appropriate alias has not yet been set. To determine if the alias has been set, enter alias | grep opcui at the command line. Assure the .login file contains the following line:

    alias opcui unsetenv DISPLAY;/iws/usm/usmstart

    The .login file is critical to the operation of the OPC; inadvertent changes may result in OPC corruption.

    1.1.8 USM: UI is Garbled

    The user session was working correctly, but later the screen columns and rows become garbled. The data is correct but displayed in the wrong places.

    Reason-1
    The terminal session has been corrupted. Run the /usr/bin/reset tool to reset the terminal session.

    1.1.9 USM: UI Won't Start

    The login to the OPC succeeds but the User Session Manager will not start.

    Reason-1
    Another user session manager cannot be started from within a UNIX shell already running under a user session (CMT and GUI).

    Reason-2
    User has logged in too soon after an OPC power up. Wait a few minutes and try to log in again.

    Reason-3
    Too many users are already logged in or too many processes (tools) are running. Log in to the OPC as root and enter the who command to see who is logged into the OPC. The wall or write command can be used to send messages to other users asking them to log off or close tools.

    See USM: Using the wall and write commands on page 1-13.

    Reason-4
    A process is growing out of control. Run view /var/log/syslog and enter ?LIMIT_REACHED to search for exceeded process limits. The nir process may be consuming excessive amounts of CPU time. Kill the nir process if necessary.
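
    The syslog search described in Reason-4 can also be run against a copy of the log. The following is a minimal sketch, assuming the contents of /var/log/syslog are available as lines of text. The function name find_limit_reached is hypothetical, and the sample lines are invented for illustration; only the LIMIT_REACHED marker comes from this guide.

```python
def find_limit_reached(lines, marker="LIMIT_REACHED"):
    """Return (line_number, text) pairs for lines containing the marker."""
    hits = [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if marker in line
    ]
    # A ?pattern search in view runs backwards through the file,
    # so the most recent matches are reported first here as well.
    return list(reversed(hits))


# Invented sample lines; the real OPC syslog layout may differ.
sample = [
    "May 27 15:20:01 opc nir: started\n",
    "May 27 15:25:44 opc kernel: LIMIT_REACHED for process nir\n",
    "May 27 15:26:10 opc usm: session opened\n",
    "May 27 15:31:02 opc kernel: LIMIT_REACHED for process nir\n",
]

for number, text in find_limit_reached(sample):
    print(number, text)
```

    In practice the lines would come from open("/var/log/syslog"), read on the OPC or from a copy of the file.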

    Reason-5
    Possible corruption of the /etc/group file. The group file contains information correlating group level privileges and associated userids. Assure all userids correspond to the proper groups and assure all groups exist. The group file is critical to the operation of the OPC; inadvertent changes may result in OPC corruption.

    See CUA: Verifying the Contents of the Group File on page 13-6.

    Reason-6
    The owner of the userid's directory under /home/ has been changed or incorrectly set. The owner should be the same as the userid. Example listing for the userid maint:

    opc> ll /home/maint
    drwxrwxr-x 2 maint slat 1024 May 27 15:20 maint

    The fields of interest are the privilege (drwxrwxr-x), owner (maint), group (slat) and directory (maint). The maint userid has a directory named maint. It is owned by maint and is part of the slat group.

    Reason-7
    Possible corruption of the /etc/passwd file. The passwd file contains information correlating startup shells and privileges. Assure all userids correspond to the proper shells and the privileges are set correctly. The passwd file is critical to the operation of the OPC; inadvertent changes may result in OPC corruption.

    See CUA: Verifying the Contents of the Password File on page 13-4.

    Reason-8
    Possibly the CUA database has been corrupted and the available toolsets are missing. Use the /iws/opcdb/opcdbtst tool to determine the state of the OPC. This failure sometimes manifests itself by displaying the Critical System Resource Unavailable error dialogue when trying to start the USM session.

    Reason-9
    OPC has failed the sanity check.

    1.1.10 USM: Critical System Resource Unavailable

    The error message Critical System Resource Unavailable usually indicates that an MSR that an application needs is not available. This error message is only displayed when the particular tool is invoked.

    Reason-1
    OPC is not commissioned. Open the Commissioning Manager and commission the system.

    Reason-2
    An MSR has been manually busied. Open the drmstat tool. Check for MSRs which are in the manualbusy:systemterminate state. Determine why the MSR has been busied and return the MSR to service if possible.

    Reason-3
    The OPC is on the verge of performing an automatic or manual shutdown. During the shutdown process all MSRs are brought down.

    Reason-4
    The CUA database has been corrupted and the following error message is displayed: Information Failed to retrieve user profile information due to the unavailability of a critical system resource

    See USM: Failed to retrieve user profile on page 1-9.
    See USM: There are no toolsets defined for this user on page 1-8.

    1.1.11 USM: Excessively Slow

    The USM is running but everything is very slow.

    Reason-1
    Too many users are already logged in or too many processes (tools) are running. Log in to the OPC as root and enter the who command to see who is logged into the OPC. The wall or write command can be used to send messages to other users asking them to log off or close tools.

    See USM: Using the wall and write commands on page 1-13.

    Reason-2
    A process is growing out of control. In some rare cases a process is not operating properly and will start consuming increasing amounts of CPU time. To determine if a process is growing, enter ps -ef | more and record the highest CPU times and the associated processes. Repeat this a number of times. If a process seems to be growing, contact the appropriate OPC support authority.
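
    The repeated ps -ef comparison in Reason-2 can be sketched as follows. This is an illustration only: it assumes the usual ps -ef column order (UID PID PPID C STIME TTY TIME COMMAND) with STIME printed as either one token (15:20:01) or two (May 27), and the function names and sample snapshots are hypothetical.

```python
def parse_cpu_time(text):
    """Convert a ps TIME field such as '1:05' or '01:02:03' to seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def parse_ps(output):
    """Map PID -> (cpu_seconds, command) for one 'ps -ef' snapshot."""
    table = {}
    for line in output.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 8:
            continue
        # Assumption: STIME is two tokens when it starts with a month name.
        tty = 6 if fields[4].isalpha() else 5
        if len(fields) < tty + 3:
            continue
        try:
            pid = int(fields[1])
            cpu = parse_cpu_time(fields[tty + 1])
        except ValueError:
            continue  # unexpected layout, ignore the line
        table[pid] = (cpu, " ".join(fields[tty + 2:]))
    return table


def growing_processes(snapshots):
    """Return (growth_seconds, pid, command) for processes whose CPU time
    increased between every pair of consecutive snapshots."""
    tables = [parse_ps(text) for text in snapshots]
    if len(tables) < 2:
        return []
    result = []
    for pid, (first, command) in tables[0].items():
        if not all(pid in table for table in tables):
            continue
        times = [table[pid][0] for table in tables]
        if all(later > earlier for earlier, later in zip(times, times[1:])):
            result.append((times[-1] - first, pid, command))
    return sorted(result, reverse=True)


# Invented snapshots taken a few minutes apart.
SNAPSHOT = (
    "UID PID PPID C STIME TTY TIME COMMAND\n"
    "root 1 0 0 May 27 ? {init} init\n"
    "root 412 1 5 15:20:01 ? {nir} nir\n"
    "maint 733 1 0 15:21:40 ttyp1 0:00 -csh\n"
)
snapshots = [
    SNAPSHOT.format(init=i, nir=n)
    for i, n in [("0:03", "1:10"), ("0:03", "2:45"), ("0:04", "4:30")]
]

print(growing_processes(snapshots))
```

    A process is reported only if its CPU time rose in every interval, which mirrors the guide's advice to repeat the measurement before concluding that a process is growing.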

    1.1.12 USM: Keeps Closing

    The USM starts up properly and then suddenly closes.

    Reason-1
    The OPC is being shut down by another user. Log in to the OPC as root and enter the who command to see who is logged into the OPC. The wall or write command can be used to send messages to other users asking them to log off or close tools.

    See USM: Using the wall and write commands on page 1-13.

    Reason-2
    A critical MSR has caused the OPC to shut down.

    Reason-3
    The file system is corrupted.

    Reason-4
    The hard disk is corrupted.

    1.1.13 USM: Tools Won't Open

    The UI starts up properly but the selected tools won't open.

    Reason-1
    The tool is already open by another user. Log in to the OPC as root and enter the who command to see who is logged into the OPC. The wall or write command can be used to send messages to other users asking them to log off or close tools.

    See USM: Using the wall and write commands on page 1-13.

    Reason-2
    The tool is hung or another user cannot close the tool. Determine the subsystem name of the tool. Log in to the OPC as root. Locate the PID of the lct or xt interface of the subsystem and kill it.

    1.1.14 USM: Tools Unavailable

    The UI starts up properly but some or all tools are missing.

    Reason-1
    The user class does not support the tools expected. Log in as a user that supports the desired tools, or log in as the admin user and open the Centralized User Administration tool to view the toolsets of the user and group.

    Reason-2
    Possible corruption of the /etc/group file.

    See USM: UI Won't Start on page 1-4.

    Reason-3
    The Centralized User Administration database is corrupted. Rebuilding the CUA database can be performed but it will result in a complete loss of all new userids and groups and cause all default users to revert to their default passwords.
    The OPC database is critical to the operation of the OPC; inadvertent changes may result in OPC corruption. Contact the appropriate OPC support authority.

    Reason-4
    Two or more TL1 sessions have dropped simultaneously. If more than one TL1 session is running on the OPC and those sessions drop for whatever reason, there is a 25% chance that the database semaphore will be killed, resulting in unstable database accesses. View the /var/log/syslog file. Move to the end of the file by typing G and search for the opcdb message by typing ? opcdb. If a LAPB is DOWN message is seen around the time the opcdb error is detected, there is a high probability that the TL1 dropping problem has killed off the database semaphore. Re-initialize the opcdb MSR or reboot the OPC.

    See LAPB: LAPB is Dropping on page 14-7
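
    The correlation between opcdb errors and LAPB is DOWN messages described in Reason-4 can be sketched as below. This is a hedged illustration, not an OPC tool: it assumes each syslog line begins with a standard "Mon DD HH:MM:SS" timestamp, it uses an arbitrary 60 second window for "around the time", and the function names and sample lines are invented. Only the strings opcdb and LAPB is DOWN come from this guide.

```python
from datetime import datetime, timedelta


def parse_stamp(line, year=1998):
    """Parse the leading 'Mon DD HH:MM:SS' timestamp (assumed layout).

    Syslog lines carry no year, so one is supplied by the caller.
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def lapb_drops_near_opcdb(lines, window=timedelta(seconds=60)):
    """Return (opcdb_time, lapb_time) pairs that fall within the window."""
    opcdb = [parse_stamp(line) for line in lines if "opcdb" in line]
    lapb = [parse_stamp(line) for line in lines if "LAPB is DOWN" in line]
    return [
        (error, drop)
        for error in opcdb
        for drop in lapb
        if abs(error - drop) <= window
    ]


# Invented sample lines; the real message text may differ.
sample = [
    "May 27 15:20:01 opc tl1: LAPB is DOWN",
    "May 27 15:20:09 opc opcdb: semaphore error",
    "May 27 18:02:30 opc opcdb: semaphore error",
]

for error, drop in lapb_drops_near_opcdb(sample):
    print("opcdb error at", error, "follows LAPB drop at", drop)
```

    A non-empty result only suggests the TL1 dropping problem; the window size should be adjusted to match how closely the messages appear in the actual log.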

    1.1.15 USM: There are no toolsets defined for this user

    The error message Information There are no toolsets defined for this user usually indicates that the Centralized User Administration database has been improperly reset.

    Reason-1
    During a database reset the Centralized User Administration database was corrupted.

    Reason-2
    Two or more TL1 sessions have dropped simultaneously. If more than one TL1 session is running on the OPC and those sessions drop for whatever reason, there is a 25% chance that the database semaphore will be killed, resulting in unstable database accesses. View the /var/log/syslog file. Move to the end of the file by typing G and search for the opcdb message by typing ? opcdb. If a LAPB is DOWN message is seen around the time the opcdb error is detected, there is a high probability that the TL1 dropping problem has killed off the database semaphore. Re-initialize the opcdb MSR or reboot the OPC.

    See LAPB: LAPB is Dropping on page 14-7

    1.1.16 USM: Failed to retrieve user profile

    The error message Information Failed to retrieve user profile information due to the unavailability of a critical system resource usually indicates that the Centralized User Administration database has been improperly reset.

    Reason-1
    During a database reset the Centralized User Administration database was corrupted.

    Reason-2
    Two or more TL1 sessions have dropped simultaneously. If more than one TL1 session is running on the OPC and those sessions drop for whatever reason, there is a 25% chance that the database semaphore will be killed, resulting in unstable database accesses. View the /var/log/syslog file. Move to the end of the file by typing G and search for the opcdb message by typing ? opcdb. If a LAPB is DOWN message is seen around the time the opcdb error is detected, there is a high probability that the TL1 dropping problem has killed off the database semaphore. Re-initialize the opcdb MSR or reboot the OPC.

    See LAPB: LAPB is Dropping on page 14-7

    1.1.17 Telnet: Telnet Connection not running

    Telnet to the OPC is not working or is working too slowly.

    Reason-1
    The EtherNet port is not turned on. Assure that the hosts, netlinkrc and rc files are correctly edited. If not, use the /iws/lan/ether_admin tool to initialize/enable the port. After using the ether_admin tool, you may be prompted to reboot or shut down the OPC. NEVER enter reboot or shutdown at the OPC prompt; instead use the OPC Shutdown tool from the USM.

    Reason-2
    The EtherNet connection is very noisy or congested. Use the ping command to determine the health of the EtherNet connection. In some cases, a ping with 1000 byte packets may be necessary to expose weaknesses in the LAN connection. If packets are being lost or are very slow to echo back, the LAN is very noisy and/or congested.

    opc> ping primary 1000 3
    PING primary: 1000 byte packets
    1000 bytes from 47.105.7.9: icmp_seq=0. time=17. ms
    1000 bytes from 47.105.7.9: icmp_seq=1. time=10. ms
    1000 bytes from 47.105.7.9: icmp_seq=2. time=15. ms
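
    The ping output above can be summarised to see lost packets and slow echoes at a glance. The following is a minimal sketch that parses reply lines in the format shown in this guide; the function name summarize_ping is hypothetical, and the guide gives no thresholds for "slow", so none are applied here.

```python
import re

# Matches reply lines such as:
#   1000 bytes from 47.105.7.9: icmp_seq=0. time=17. ms
REPLY = re.compile(r"icmp_seq=(\d+)\.?\s+time=(\d+(?:\.\d+)?)\.?\s*ms")


def summarize_ping(output, sent):
    """Summarise replies for a ping run that sent the given packet count."""
    times = {int(m.group(1)): float(m.group(2)) for m in REPLY.finditer(output)}
    received = len(times)
    return {
        "sent": sent,
        "received": received,
        "lost": sent - received,
        "avg_ms": sum(times.values()) / received if received else None,
        "max_ms": max(times.values()) if received else None,
    }


output = """PING primary: 1000 byte packets
1000 bytes from 47.105.7.9: icmp_seq=0. time=17. ms
1000 bytes from 47.105.7.9: icmp_seq=1. time=10. ms
1000 bytes from 47.105.7.9: icmp_seq=2. time=15. ms
"""

print(summarize_ping(output, sent=3))
```

    The sent count is passed in explicitly (3 in the example command) because this sketch does not rely on any summary line from ping itself.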

    Reason-3
    The EtherNet connection is down. Check the EtherNet drop and connector and assure that it is properly connected. Contact the local LAN administrator to see if there are any problems with the LAN. Use the ping or tstatc command to determine if EtherNet connectivity is established.

    1.1.18 GUI: Login Window Not Available

    The login window for the GUI session will not appear on the xterminal.

    Reason-1
    The EtherNet connection is not working properly.

    See Telnet: Telnet Connection not running on page 1-9.

    Reason-2
    The xterminal is not set up properly.

    See GUI: Setting Up An Xterminal for an OPC on page 1-14.

    1.1.19 GUI: Text doesn't fit in Window

    Sentences, titles and names are too large or too small for the GUI window allocated.

    Reason-1
    The fonts used by the GUI are dependent on the font server specified in the configuration parameters of the workstation or xterminal.
    For a workstation, it is normal for the fonts to be the incorrect size, as the workstation will use its own fonts to drive the GUI display. Workstation GUI to an OPC is not supported.
    For an xterminal, the font server and possibly the configuration server settings are incorrect. The font and configuration servers should be set to the OPC IP address.

    See GUI: Setting Up An Xterminal for an OPC on page 1-14.

    1.2 Solution Description

    1.2.1 VT100: Reconfiguring Port B to Terminal

    The following procedure outlines the steps needed to unconfigure services on Port B and then reconfigure Port B as a terminal. A detailed procedure can be found in the NTP, Volume: Operations, Administration and Provisioning, Section: System Administration Procedures.

    1) Open the PortConfiguration tool from the USM or enter PortConfiguration tool at the opc prompt
    2) Select item 3, (Unconfigure a service)
    3) Select item 1, to unconfigure the service
    4) Repeat step 3, until there are no services configured on Port B
    5) Select item 8, (Return to Main menu)
    6) Select item 2, (Configure a service)
    7) Select item 1, (Terminal)
    8) Select item 8, (Return to Main menu)
    9) Select item 1, (Query Port Configuration)
    10) Assure the port is configured as a terminal
    11) Return to the main menu
    12) Select item 9, (Exit)

    1.2.2 VT100: Port B Cable Pinouts

    The following table outlines the pinouts of the various cables that can connect to Port B. The cable connector to Port B is always a male DB9. For more information, refer to the NTP, Volume: Installation, Section: Installing Peripheral Cables.

    TABLE 1.

    OPC Port B to         OPC Port B to         OPC Port B to         OPC Port B to
    Asynchronous DCE      Synchronous DCE       Asynchronous DTE      Asynchronous DTE
    (MODEM)               (X25)                 (VT100)               (LAPTOP)
    OPC(DB9)  DCE(DB25)   OPC(DB9)  DCE(DB25)   OPC(DB9)  DTE(DB25)   OPC(DB9)  DTE(DB9)
    1         8           1         17          1         4           1         -
    2         3           2         3           2         2           2         3
    3         2           3         2           3         3           3         2
    4         20          4         20          4         5, 6        4         1
    5         7           5         7           5         7           5         5
    6         6           6         15          6, 8      20          6         4
    7         4           7         4           7         8           7         8
    8         5           8         5           -         -           8         7

    1.2.3 VT100: Port B settings

    The following table outlines the VT100 settings required to connect a VT100 terminal to Port B.

    TABLE 2.

    VT100 Terminal Settings

    Variable        Setting       Comment
    mode            VT100         no other is supported
    baud            1200 - 9600   auto-bauds down only
    bits            8 bit
    parity          none
    stop            1
    Xon/Xoff        enabled
    auto new line   off
    duplex          full          (no local echo)
    autowrap        off
    scroll          jump          smooth can also be used
    columns         80
    controls        interpret
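
    The VT100 settings table can be used as a checklist when a Port B login problem is suspected. The sketch below encodes the settings as data and reports any mismatches; the dictionary keys follow the Variable column, while the function name check_vt100 and the way values are encoded (strings, with baud as an integer) are assumptions made for illustration.

```python
# Allowed values per setting, taken from the VT100 Terminal Settings table.
REQUIRED = {
    "mode": {"VT100"},
    "bits": {"8"},
    "parity": {"none"},
    "stop": {"1"},
    "Xon/Xoff": {"enabled"},
    "auto new line": {"off"},
    "duplex": {"full"},
    "autowrap": {"off"},
    "scroll": {"jump", "smooth"},  # smooth can also be used
    "columns": {"80"},
    "controls": {"interpret"},
}


def check_vt100(settings):
    """Return a list of human-readable mismatches against the table."""
    problems = []
    baud = settings.get("baud")
    if not isinstance(baud, int) or not 1200 <= baud <= 9600:
        problems.append(f"baud: {baud!r} is outside 1200 - 9600")
    for name, allowed in REQUIRED.items():
        value = settings.get(name)
        if value not in allowed:
            problems.append(f"{name}: {value!r} not in {sorted(allowed)}")
    return problems


# Hypothetical terminal with two wrong settings.
terminal = {
    "mode": "VT100", "baud": 300, "bits": "8", "parity": "none", "stop": "1",
    "Xon/Xoff": "enabled", "auto new line": "on", "duplex": "full",
    "autowrap": "off", "scroll": "smooth", "columns": "80",
    "controls": "interpret",
}

for problem in check_vt100(terminal):
    print(problem)
```

    In this example the enabled auto new line option corresponds to the symptom in VT100: Password Automatically Rejected, and the 300 baud rate falls below the 1200 baud minimum.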

    1.2.4 USM: Using the wall and write commands

    The wall and write commands are used to send messages to other users on the same OPC. wall stands for write all, and is used to broadcast messages to all users presently logged onto the same OPC. write is used to send messages to a specified user logged onto the same OPC.

    To use wall:
    1) From an OPC prompt, enter wall
    2) Enter the message, ending each line of the message with a carriage return
       NOTE: wall permits multiple line messages to be sent
    3) When the message is completed, hit CTRL-D to send it

    NOTE:
    - wall can send messages to all OPC CMT sessions
    - wall cannot send messages to the GUI session except for the GUI console window
    - To abort the message, hit CTRL-C

    Example Usage:
    opc> wall
    Hello World
    How are you

    To use write:
    1) From an OPC prompt, enter who
    2) Determine the user and serial device you want to write to
    3) Enter write followed by the user and serial device
    4) Enter the message, one line at a time, each followed by a carriage return
       NOTE: the line is sent when the carriage return is entered
    5) When the message is completed, hit CTRL-C to terminate the write

    NOTE:
    - write can send messages to all OPC CMT sessions
    - write cannot send messages to the GUI session except for the GUI console window

    Example Usage:
    opc> write admin pty/ttyu0
    Hello World
    How are you

    1.2.5 GUI: Setting Up An Xterminal for an OPC

    This procedure assumes that the /iws/lan/ether_admin command has already been run to enable the EtherNet port. The following steps are meant only to configure an xterminal to communicate with an OPC.
    NOTE: Setting up an Xterminal for a Network Manager is different than setting up an Xterminal for an OPC.

    For NCD xterminals only:
    1) Log in to the OPC as the root user, enter /iws/lan/ether_admin
    2) Select item 3 (X terminals configuration)
    3) Add the appropriate data (boot version, xterminal address and boot server)
    4) Save data
    5) Continue with Generic xterminal setup

    Generic xterminal setup:
    Refer to your xterminal documentation on how to set up the font and configuration servers.

    1) Open the setup window for your xterminal
    2) Set the server values (font, configuration and/or display manager) to match the OPC's IP address
       Useful information to have:
       - font server: (OPC ip address)
       - X file server: (OPC ip address)
       - configuration server: (OPC ip address)
       - boot server: (appropriate workstation ip address or PROM)
       - name server: (appropriate workstation ip address or none)
       - font paths:
         /iws/X11/fonts
         /iws/X11/lib/fonts/misc
         The following fonts are NCD xterminal specific:
         /iws/X11/lib/ncd/fonts/misc
         /iws/X11/lib/ncd/fonts/100dpi
         /iws/X11/lib/ncd/fonts/75dpi
    3) Save all settings
    4) Reboot the xterminal

    1.2.6 EtherNet: OPC EtherNet Connector Pinout

    The following is the pinout of the OPC EtherNet port.
    Note: The connector is not standard 10baseT (EMI restrictions). It is suggested to use the NT suggested EtherNet cable to comply with EMI guidelines.

    [Figure: OPC EtherNet connector, pin positions numbered 6 5 4 3 2 1. Labelled pins, in the order shown: Pin 1 TX +, Pin 2 TX -, Pin 5 RX -, Pin 6 RX +.]

    2 OPC Base Operations

    2.1 Problem Description

    The following sections describe common OPC base functionality problems. Each problem is described together with its possible causes. Where a work-around is available, it is given at the end of the corresponding reason.

    2.1.1 OAM: NE Indicates an OPC OAM S/W Failure

    The NE containing the OPC is indicating a minor alarm for an OPC OAM Software Failure. The NE uses the OWS process of the OPC to determine the state of the OPC. If the OWS process is not running for any reason, the OAM failure is activated.

    Reason-1
    The OPC is in the middle of an upgrade or an install. While an OPC is being upgraded, it is normal for this alarm to become active.

    Reason-2
    The OPC is in the middle of the restore process from either a restore from tape or a data sync. While an OPC is being restored, it is normal for this alarm to be active.

    Reason-3
    The OWS process has been man-busied. Use the drmstat command to investigate the state of the MSRs. Re-initialize the MSR, if required.

    Reason-4
    The OPC is in the process of an OPC Shutdown. Wait for the OPC to shut down and restart.

    Reason-5
    The MBIF communication between OPC and NE is slow or failing. To assure MBIF communication is working, open the tstatc tool and select the k option to view MBIF statistics. Assure that the RX and TX failed packets are zero and the RX and TX packets are equal. Packets from the NE should arrive every minute.
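
    The two checks on the MBIF statistics in Reason-5 can be expressed as a small rule. This sketch assumes the four counters have been read by hand from the tstatc k display; the function name mbif_findings and its parameters are hypothetical, and the tstatc output format itself is not reproduced here.

```python
def mbif_findings(rx, tx, rx_failed, tx_failed):
    """Return a list of problems found in MBIF packet counters.

    An empty list means both checks from the guide passed:
    failed packet counts are zero and RX equals TX.
    """
    findings = []
    if rx_failed or tx_failed:
        findings.append(
            f"failed packets present (RX {rx_failed}, TX {tx_failed})"
        )
    if rx != tx:
        findings.append(f"RX ({rx}) and TX ({tx}) packet counts differ")
    return findings


# Hypothetical counter readings.
print(mbif_findings(rx=120, tx=120, rx_failed=0, tx_failed=0))
print(mbif_findings(rx=120, tx=95, rx_failed=0, tx_failed=3))
```

    Reading the counters again after a minute or more also shows whether packets from the NE are still arriving, which the guide expects about once per minute.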

    Reason-6
    If there is no k option in tstatc, the kernel does not support MBIF communication. Assure that the load running actually supports OPC OAM Software Failure.

    2.1.2 OWS: Both Primary and Backup OPCs are Active

    The Primary and Backup OPCs are both active and have been active for a period greater than five minutes.

    Reason-1
    There is a network partition. Use osiping to verify the continuity between the OPCs. Assure the associations to all NEs are available.

    See CAM: Associations are Down or are Unstable on page 2-30.

    A network partition can occur because:
    - an optical fibre/CNET break occurs
    - an NE is being downloaded
    - an NE has dropped into debug mode

    There is little that can be done for a fibre/CNET break but to wait until the break is repaired. If an NE is downloading, log in to both OPCs and open the Reboot/Load Manager to monitor the progress of the NE download. If an NE has dropped into debug mode, it will be necessary to log in to the debug port at the NE and issue the go command. It may be necessary to issue the go command several times to get the NE out of debug mode.

    Reason-2
    The Backup OPC is not commissioned or the wrong OPC is commissioned as Backup. Open the Commissioning Manager tool to verify that the proper OPC has been commissioned.

    Reason-3
    A Primary or Backup OPC has been recently replaced and/or commissioned. Whenever there is a change to the commissioning data, it is recommended to datasync the changes and to shut down the OPC (use the OPC Shutdown tool).

    Reason-4
    The ows process is not working. Under normal operation the Primary and Backup OPCs are always exchanging messages. Use tstatc to monitor the packets being sent between the OPCs. If there is no ows process running or if the ows outgoing messages are not increasing, the ows MSR must be re-initialized.

    Reason-5
    If both Primary and Backup OPCs are found Active in situations other than those mentioned above, look for the /iws/ows/ows_swact.record file on both the Primary and Backup OPCs. Delete this file from both OPCs (or from whichever OPC it exists on), then busy and return to service the warmstandby (OWS) MSR on both OPCs. The Primary and Backup OPCs will return to a normal state.

    2.1.3 OWS: Primary OPC is Inactive

    The Primary OPC is inactive and the Backup OPC is active.

    Reason-1
    The Primary OPC is just recovering after a shutdown. Wait about 5 minutes for the OPC to fully recover; the Primary OPC will then regain control of the network.

    Reason-2
    The ows_swact tool was used to switch the OPC activities. Use ows_swact -a on the Primary OPC or ows_swact -i on the Backup OPC to revert the OPC activities.

    Reason-3
    The Primary and Backup OPC were being swapped and the process was not performed correctly. Refer to the NTP for OPC module replacement.

    2.1.4 OWS: Backup OPC Won't Go Active

    The Backup OPC will not go active even when the Primary OPC is shut down.

    Reason-1
    The ows_swact tool was used to lock the Backup OPC inactive. Use ows_swact -r on the Primary or Backup OPC to release the lock.

    Reason-2
    A data sync was not performed from the Primary. Verify the commissioning information.

    2.1.5 ODS: Data Synchronization Fails

    OPC data synchronization is failing.

    Reason-1
    The connection from Primary to Backup is down. Use osiping to verify the continuity between the OPCs.

    See CAM: Associations are Down or are Unstable on page 2-30.

    Reason-2
    Directory structures are missing at either the Primary or Backup OPCs. ODS requires the directory /iws/ods/odstmp be on the Primary and the directory /users/VFS/users/opcods be on the Backup.

    Reason-3
    The time between the Primary and Backup OPC has been changed using the UNIX date command. The OPC Date tool must be used to change the dates on the OPC. To correct the problem, re-align the Backup date to match the Primary date, using the OPC Date tool.

    Reason-4
    Different loads are running on the OPCs. Put identical loads on the OPCs and then do a data sync.

    Reason-5
    Read, write or execute permissions are missing for one of the directories /users/VFS, /users/VFS/users or /users/VFS/users/opcds. Give permissions to these directories wherever they are missing.

    2.1.6 ODS: Want to Data Sync from Backup to Primary

    This is a request from customers. Normally a restore from tape using the Save and Restore tool would be the advisable method of restoring data.

    See ODS: Data Sync from Backup to Primary on page 2-34.

    2.1.7 CAM: Associations are Down or are Unstable

    The only reliable means of determining if an NE is no longer communicating is to use osiping from an OPC or clping and coping from an NE. osiping, clping and coping send a number of test patterns to the specified target. If the target is responding properly, it echoes the patterns back. For more information about osiping, see OPC Tools on page A-137.

    Reason-1
    Fibre break. When a fibre break occurs and datacom cannot be routed to NEs via other means, associations will drop. If possible, log in to both the Primary and Backup OPC and perform an osiping broadcast at both OPCs. A broadcast sends test patterns to all NEs and OPCs on the network. The resulting data can be correlated to determine where the fibre break is and which nodes are affected.
    Note: broadcast sends a large number of messages throughout the network, which may degrade network performance. If this occurs, target the NEs individually.

    Reason-2
    An NE is downloading. When an NE is downloading, it is unable to provide routing services for datacom to other NEs, thus causing a network partition at the NE. To quickly determine if an OPC is downloading an NE, use the tstatc tool to look for outgoing and incoming messages being sent to the dls process and the NE.

    Reason-3
    During network upgrades using NUM, it is normal to see only a partial SOC because both the primary and backup OPCs are sharing the SOC.

    Reason-4
    The NE has been re-commissioned and the NEid has been changed from its previous value. The OPC uses the unique NEid to identify an NE. If the NEid has changed on the OPC (re-commissioned) but not at the NE (requires a reboot), then the associations to that NE will be down. Use nnsmon -d to view all visible NEs on the network. Re-commission the NEid back to its original value. It may take up to 5 minutes to re-establish the associations.

    Reason-5
    The NE is experiencing trouble with the CNET or SDCC. Log in to the NE. Ensure the CNET and SDCC alarms are not masked. Check the NE UI alarm screen. Correct the CNET and SDCC alarms at the NE. At the NE UI, enter cnet readstat, mlapd readstats 0 and mlapd readstats 1. Frame Fragments or Errors greater than 0 (zero) indicate a possible noisy CNET cable or port. Replace the hardware.

    Reason-6
    The NE comm ports are down. At the NE UI, enter the ports all command to retrieve a summary of all CNET, SDCC and LAPD ports. If the CNET or SDCC port is OOS or off, log in to the NE from an hmi port and use the facomm;ports;portprov;chgstate command to enable the comm port. If the port is IStbl then there is a hardware fault. Verify the fault by putting the port OOS and then IS. If the problem persists, return the SP to Nortel.

    Reason-7
    There is another NE with the same NEid in the network. Use nnsmon -d to locate the duplicate NE. Change the NEids of the duplicate NEs.
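
    As a rough illustration of Reason-7, the sketch below flags NEids that appear at more than one network address. It assumes the list of visible NEs has already been transcribed from the nnsmon -d display into (NEid, address) pairs; the pair layout and the sample values are hypothetical, and no nnsmon output parsing is attempted.

```python
from collections import defaultdict


def find_duplicate_neids(visible_nes):
    """Return {neid: [addresses]} for every NEid seen at more than one address.

    visible_nes is an iterable of (neid, address) pairs (hypothetical layout).
    """
    addresses_by_id = defaultdict(set)
    for neid, address in visible_nes:
        addresses_by_id[neid].add(address)
    return {
        neid: sorted(addresses)
        for neid, addresses in addresses_by_id.items()
        if len(addresses) > 1
    }


# Hypothetical sample data: NEid 12 is visible at two different addresses.
sample = [(10, "addr-A"), (11, "addr-B"), (12, "addr-C"), (12, "addr-D")]
print(find_duplicate_neids(sample))
```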

    Reason-8
    The backup OPC is active and has taken control of the SOC.

    See OWS: Both Primary and Backup OPCs are Active on page 2-28.

    Reason-9
    There are too many NEs on the CNET.

    See H/W: CNET Fail on page 3-37.

    2.1.8 OPC: OPC is Not Communicating

    The OPC is not communicating over CNET, SONET, EtherNet or Port B. There is no access to the debug port of the OPC.


    Reason-1
    The user logged out near the end of an OPC shutdown without confirming the final shutdown prompt. The work-around is to remove and re-seat the OPC. Before removing the OPC, ensure the hard disk is not spinning by listening to the OPC.

    Reason-2
    An OPC shutdown was performed while the OPC Save and Restore tool was performing a restore operation. The work-around is to remove and re-seat the OPC. Before removing the OPC, ensure the hard disk is not spinning by listening to the OPC.

    Reason-3
    After an installation or an upgrade, the /etc/reboot command fails to reboot the OPC properly. This is very rare and can be remedied by removing and re-seating the OPC. Before removing the OPC, ensure the hard disk is not spinning by listening to the OPC.

    Reason-4
    The file system is corrupted.

    Reason-5
    The hard disk is corrupted.

    2.1.9 OPC: OPCCLEAN is Not Running

    If OPCCLEAN is not running there is a chance that the OPC disk will become full. There should be events recorded in the /var/log/syslog file indicating that OPCCLEAN was run each night.

    Reason-1
    The opcclean script is not available or not executable. Perform a ll /iws/opcfiles/opcclean. If the script is not found, retrieve a copy from another OPC or extract the file from tape. If the permissions are not set properly, change them with the chmod command.
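
    The "not available or not executable" check in Reason-1 amounts to two tests on the script path. The following is a minimal sketch of that logic, not a replacement for the ll and chmod steps above; the function name is hypothetical, and whether a Python interpreter is available on a given OPC is an assumption.

```python
import os


def check_script(path):
    """Classify a script path as 'missing', 'not executable' or 'ok'."""
    if not os.path.isfile(path):
        return "missing"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


if __name__ == "__main__":
    # Path taken from Reason-1 above.
    print(check_script("/iws/opcfiles/opcclean"))
```

    The same check applies unchanged to any other maintenance script path.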

    2.1.10 OPC: OPC is Continuously Rebooting

    The OPC is not recovering from its reboot cycle. The symptom is a continuously running diagnostic cycle. A general troubleshooting hint is to read the diagnostic messages being displayed, as these provide valuable information for determining the root cause of the failure.

    Reason-1
    The 1 Hz clock has been enabled on an OPC running firmware release OPCREL02 or OPCREL01. There is a bug in firmware releases older than OPCREL03 such that the OPC will remain in the reboot cycle indefinitely when the 1 Hz clock is enabled. To determine the firmware release of the OPC, use the firmware check described in OPC: Booting from Tape (section 2.2.1).

    The work-around is to pull the OPC out of the shelf and re-seat it. This will cause the 1 Hz clock to turn off and the OPC will recover from the reboot. The 1 Hz clock must then be turned off from the OPC Date tool.

    Reason-2
    There has been a file system corruption. This usually means the OPC has managed to get through the diagnostic cycle but is unable to run the kernel. If this is the case, there is very little that can be done except to re-initialize the disk and re-install the load.

    Reason-3
    There has been a disk media failure. Nothing can be done if this is the case, except to re-initialize the disk and re-install the load.

    Reason-4
    There has been a hardware failure. This can be determined from the diagnostic readout. The OPC must be returned to Nortel.

    2.1.11 OWS: OWS_SWACT Doesn't Work

    The ows_swact command is not working.

    Reason-1
    The backup OPC is not commissioned. The ows_swact command requires that a backup be commissioned. Commission a backup OPC.

    Reason-2
    Communication to the backup OPC is down. Ensure communication to the backup OPC and the NEs is available. Use osiping and tstatc to check the communication paths.

    See CAM: Associations are Down or are Unstable on page 2-30.
    See OPC Tools on page A-137.

    Reason-3
    The ows_swact communication protocol is confused between the old and the new software releases. Open the /usr/adm/swilog file with view and go to the end of the file by pressing G. The log message will indicate that the failure occurred during the ows_disable_swact. Quit the file by entering :q!. Use tstatc to ensure ows is messaging properly to the backup OPC. Perform ows_swact; if the communication path is bad then there is an ows_swact protocol problem.
    The workaround is to busy the warmstandby MSR on both the primary and backup OPCs at the same time. Ensure both OPCs are OOS, then Return to Service the warmstandby MSR on both OPCs at the same time. Perform ows_swact and ensure the communication path is okay. Resume the NUM in progress.

    See OPC Tools on page A-137.

    Reason-4
    NUM has not initialized ows_swact properly. Open the /usr/adm/swilog file with view and go to the end of the file by pressing G. The log message will indicate that the failure occurred during the ows_disable_swact. Quit the file by entering :q!. Use tstatc to ensure ows is messaging properly to the backup OPC. Perform ows_swact; if the peer OPC is reported as not commissioned then ows_swact did not initialize properly. Shut down the primary OPC to correct the problem. After the OPC recovers from the shutdown, perform ows_swact and ensure the communication path is okay. Resume the NUM in progress.

    2.2 Solution Description

    2.2.1 OPC: Booting from Tape

    In order to perform a boot from tape, the following criteria must be met: the tape is a diskinit tape and the OPC firmware release is 3 or higher. To determine what firmware is running on the OPC, perform the following:
    1) Configure Port B as a terminal
    2) Connect a VT100 to Port B
    3) Login to the OPC and shutdown the OPC using the Shutdown tool
    4) The very first line spooling on the screen will be:

    ROM Load Information:OPCRELX
    where X is the release version (03 and higher)
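
    The firmware criterion above reduces to reading the release number out of the ROM line and comparing it with 3. A minimal sketch follows, assuming the line has been copied from the VT100 display as text and that the release always appears as OPCREL followed by digits; the function names are hypothetical.

```python
import re


def firmware_release(rom_line):
    """Return the numeric release from a line such as
    'ROM Load Information:OPCREL03', or None if no release is found."""
    match = re.search(r"OPCREL(\d+)", rom_line)
    return int(match.group(1)) if match else None


def can_boot_from_tape(rom_line):
    """Boot from tape requires firmware release 3 or higher."""
    release = firmware_release(rom_line)
    return release is not None and release >= 3


print(can_boot_from_tape("ROM Load Information:OPCREL03"))
print(can_boot_from_tape("ROM Load Information:OPCREL02"))
```

    The same comparison identifies the firmware releases older than OPCREL03 that are affected by the 1 Hz clock reboot problem.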

    To boot from tape:
    1) Configure Port B as a terminal
    2) Connect a VT100 to Port B
    3) Insert the diskinit tape into the OPC tapedrive
    4) Login to the OPC and shutdown the OPC using the Shutdown tool
    5) The OPC will now boot from the tape
    6) The root password is toor. To access the disk maintenance menu, the userid is opc and the password is cpo.

    2.2.2 OSI: Reconstructing an OPC's Serial Number

    There are several methods to regenerate an OPC serial number. The procedures below use the checksum command; if checksum is unavailable, guessing can be used to determine the full serial number. The OPC Commissioning manager displays the OPC's serial number in the upper right hand corner of the screen. Use these procedures only if you do not have access to the Commissioning manager.

    Reconstructing an OPC Serial Number using checksum:
    1) Login to the OPC, as root
    2) Enter checksum -l
    3) The serial number will be given

    or

    1) Login to the OPC, as root
    2) Enter uname -a
    3) A line will be displayed which will include the base serial number b17e####
    4) Enter checksum -o 3e####, where #### are the last 4 numbers retrieved from the base serial number in step 3
    5) The serial number will be given

    Note: The above information assumes you are using a so-called legacy OPC. The partitioned OPC uses serial numbers with a slightly different format.
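
    The second procedure derives the checksum argument from the base serial number reported by uname -a. The sketch below shows only that string manipulation for a legacy OPC; it assumes the base serial number is exactly b17e followed by four characters, and the function name and sample value are hypothetical. It does not compute the serial number itself, which is still produced by the checksum command.

```python
def checksum_command(base_serial):
    """Build the 'checksum -o' command line from a base serial number
    of the form b17e#### (legacy OPC format assumed)."""
    if len(base_serial) != 8 or not base_serial.startswith("b17e"):
        raise ValueError("expected a base serial number of the form b17e####")
    last_four = base_serial[-4:]
    return "checksum -o 3e" + last_four


# Hypothetical base serial number.
print(checksum_command("b17e1234"))
```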


    3 OPC Hardware

    3.1 Problem Description

    The following sections describe common hardware problems. Each problem will be described and possible diagnostic reasons will be provided. Available work-arounds will be provided at the end of each problem reason description.

    3.1.1 H/W: ELAN Fail is Lit

    The EtherNet Fail light is on.

    Reason-1
    The EtherNet connection is broken or is not connected. Check the EtherNet cable and connection for continuity.

    Reason-2
    The EtherNet port is not turned on. Use the /iws/lan/ether_admin tool to initialize/enable the EtherNet port.

    See Telnet: Telnet Connection not running on page 1-9.

    Reason-3
    The LAN is having problems.

    See Telnet: Telnet Connection not running on page 1-9.

    Reason-4
    The EtherNet LOS is on but EtherNet is actually running. Sometimes the EtherNet LOS gets stuck. This can be remedied by shutting down the OPC through the OPC Shutdown tool. Normally this will clear on its own when the OPC detects traffic on the LAN.

    3.1.2 H/W: CNET Fail

    The Control NET (CNET) light is on or a CNET Fail alarm is raised.

    Reason-1
    The CNET connection between the OPC and NE shelf processor is being corrupted. Use the tstatc tool to determine the extent of the CNET problem.

    See CNET: Using tstatc to evaluate CNET on page 3-6.


    Reason-2
    There is a termination fault with the CNET connection. Ensure that the proper equipment is connected to the CNET ports and that all unused CNET ports are terminated with a proper CNET terminator.

    Reason-3
    The system contains more than 15 NEs and more than 5 of those NEs are on the same CNET. This situation results in excessive messaging on the CNET and causes the CNET hardware to go off-line. The work-around is to identify the CNET path between the problem NE and the primary OPC. Then, for all other NEs not on that path but connected to CNET, disable CNET until the problem NE completely recovers. Once the NE recovers, enable CNET on all the NEs on CNET.
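
    The condition in Reason-3 is a simple pair of thresholds. As a sketch only, with the NE counts assumed to be taken by hand from the commissioning data and a hypothetical function name:

```python
def cnet_overload_risk(total_nes, nes_on_same_cnet):
    """True when the system has more than 15 NEs and more than 5 of
    them share one CNET, the situation described in Reason-3."""
    return total_nes > 15 and nes_on_same_cnet > 5


print(cnet_overload_risk(16, 6))
print(cnet_overload_risk(15, 6))
```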

    Reason-4
    There is a hardware fault with the CNET connection. Contact the appropriate OPC support authority.

    3.1.3 H/W: Active is not lit

    The Green Active light is off.

    Reason-1
    The OPC is presently being shut down or is in the process of starting up after a shutdown. Wait 5 to 20 minutes for the OPC to complete the shutdown process.

    Reason-2
    The OPC is not powered up or is not seated properly in the shelf. Reseat the OPC.

    3.1.4 H/W: Unit Fail is lit

    The Unit Fail light is lit.

    Reason-1
    An OPC shutdown was performed with the halt option and the OPC was left in the shelf. If an OPC is halted and left in the shelf for three minutes or longer, the OPC will re-start. In order to alert the user that the OPC is starting up, the Unit Fail will light. While the Unit Fail is lit, do not pull out the OPC. The OPC restart can be monitored by connecting a VT100 to Port B.

    Reason-2
    The OPC was improperly seated. Log in to the OPC to verify that the OPC is functioning properly. Run the /etc/dmesg command to ensure no hardware faults have occurred during re-start. If no reason for the Unit Fail can be found, shut down the OPC using the OPC Shutdown tool and reseat the OPC.

    Reason-3
    The OPC is unable to function as a result of a file system corruption.

    Reason-4
    There is a real hardware fault. Return the OPC to Nortel.

    3.1.5 TAPE: Amber Light is on

    The amber light on the tape drive is on.

    Reason-1
    The tape is rewinding or is in use by another application. Log in to the OPC as root and enter the who command to see who is logged into the OPC. The wall or write command can be used to send messages to other users asking them to log off or close tools.

    See USM: Using the wall and write commands on page 1-13.

    3.1.6 TAPE: Amber Light is Flashing Rapidly

    The amber light on the tape drive is flashing rapidly. Use the /etc/dmesg command to determine the tape drive problem.

    Reason-1
    Moisture has been detected in the tape drive. This will occur most often when the OPC has been brought in from a cold environment or if the environment is very humid. Wait up to 60 minutes for the OPC to dry out.

    Reason-2
    There has been a media fault. The tape drive needs to be cleaned.

    Reason-3
    There has been a real hardware fault. Return the OPC to Nortel.

    3.1.7 TAPE: Green Light is on

    The green light on the OPC is lit.


    Reason-1
    A tape has been inserted into the tape drive. It is normal for the green light to be on while a tape is in the drive.

    3.1.8 TAPE: Green Light is Flashing Slowly

    The green light on the tape drive is flashing slowly. Use the /etc/dmesg command to determine the tape drive problem.

    Reason-1
    There has been a media fault. The tape drive needs to be cleaned.

    Reason-2
    The tape inserted is generating excessive errors. The following can cause excessive errors: dirty tape drive, bad tape or kernel problems. Clean the tape drive; if the error persists, try a different tape. If the new tape still causes excessive errors then there is a real hardware fault and the OPC must be returned to Nortel.

    3.1.9 TAPE: Green Light is Flashing Slowly and the Amber Light is on

    The green light on the tape drive is flashing slowly and the amber light on the tape drive is on.

    Reason-1
    A pre-recorded audio tape has been inserted into the tape drive and it is being automatically played. Eject the tape and put in a proper OPC tape.

    3.1.10 TAPE: Green Light is Flashing Rapidly

    The green light on the tape drive is flashing rapidly. Use the /etc/dmesg command to determine the tape drive problem.

    Reason-1
    There has been a media fault. The tape drive needs to be cleaned.

    Reason-2
    The tape drive is having difficulty writing to the tape. The following can cause tape write problems: dirty tape drive, bad tape or kernel problems. Clean the tape drive; if the error persists, try a different tape. If the new tape still causes write problems then there is a real hardware fault and the OPC must be returned to Nortel.
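
    Sections 3.1.5 to 3.1.10 can be condensed into a lookup from the state of the two tape drive lights to the possible reasons. The sketch below is a summary aid only. It assumes that a light not mentioned in a section is off, which the text does not state, and the state names ("off", "on", "slow", "rapid") and function name are hypothetical.

```python
# Keys are (green_state, amber_state).
TAPE_LIGHT_REASONS = {
    ("off", "on"): ["tape is rewinding or in use by another application"],
    ("off", "rapid"): [
        "moisture detected in the tape drive",
        "media fault (clean the tape drive)",
        "real hardware fault",
    ],
    ("on", "off"): ["tape inserted (normal)"],
    ("slow", "off"): [
        "media fault (clean the tape drive)",
        "tape is generating excessive errors",
    ],
    ("slow", "on"): ["pre-recorded audio tape inserted"],
    ("rapid", "off"): [
        "media fault (clean the tape drive)",
        "tape drive is having difficulty writing to the tape",
    ],
}


def tape_light_reasons(green, amber):
    """Return the documented reasons for a light combination, or an
    empty list if the combination is not covered by sections 3.1.5-3.1.10."""
    return TAPE_LIGHT_REASONS.get((green, amber), [])


print(tape_light_reasons("slow", "on"))
```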


    3.1.11 TAPE: Tape Won't Eject

    Pushing the eject button won't eject the tape.

    Reason-1
    Eject mechanism is jammed. Shut down the OPC using the OPC Shutdown tool. While the OPC is re-starting, press the eject button.

    3.1.12 TAPE: RBNCLEAN is Not Running

    If RBNCLEAN is not running there is a chance that a need to clean the heads on the tape drive of the OPC will not be detected. There should be events recorded in the /var/log/syslog file indicating that RBNCLEAN was run each day.

    Reason-1
    The rbnclean script is not available or not executable. Perform a ll /etc/drive_cleaning/rbnclean. If the script is not found, retrieve a copy from another OPC or extract the file from another tape. If the permissions are not set properly, change them with the chmod command.
    NOTE: This script is only available on the Legacy OPC.

    3.1.13 TAPE: Tape Drive Cleaning Alarm

    The Primary and/or Backup: tape drive cleaning required alarm is active on the OPC.

    Reason-1
    The heads require cleaning on the tape drive of the OPC.
    NOTE: You must clear the alarm manually using the OPC Alarm Provisioning tool after you clean the tape drive.

    3.1.14 BAD DISK: SCANDISK/KLS is Not Running

    If SCANDISK and/or the Kernel Login System (KLS) is not running there is a chance that bad disk media will not be detected on the OPC. There should be events recorded in the /var/log/syslog file indicating that SCANDISK was run each Wednesday and that KLS was running.

    Reason-1
    The scandisk script is not available or not executable. Perform a ll /etc/scandisk. If the script is not found, retrieve a copy from another OPC or extract the file from another tape or flash card. If the permissions are not set properly, change them with the chmod command.


    Reason-2
    KLS is not running on the OPC. For KLS to be running, the OPC must be commissioned. Commission the OPC.

    3.1.15 BAD DISK: Disk Bad Media Alarm

    The Primary and/or Backup: disk bad media detected alarm is active on the OPC.

    Reason-1
    There is a problem on the hard disk of the OPC. Reinitialize the OPC hard disk.
    NOTE: You must clear the alarm manually using the OPC Alarm Provisioning tool after you repair the problem.

    3.1.16 HARDRIVE INDICATOR LIGHT: flashing

    Reason-1
    OPCs with a 1000 MB hard drive have a hard drive indicator light. If this light is flashing, the hard drive is being accessed. It therefore serves as a warning that the OPC should not be re-seated while the hard drive is in use, as this may cause hardware problems.

    3.2 Solution Description

    3.2.1 CNET: Using tstatc to evaluate CNET

    tstatc can perform a number of OSI-related functions. This procedure indicates how to locate the CNET socket information using tstatc.

    See Complete set of OPC Tools on page A-1.

    Using tstatc for CNET information:
    1) Login to OPC, as root
    2) Enter tstatc
    3) A menu of available commands will appear
    4) Enter c, for CLNP subnet statistics
    5) A display of CNET and LAN sockets will appear
    6) Enter t, for a full display of the CNET sockets information
    7) A display of complete CNET socket information will appear


    4 NE Software Download

    4.1 Problem Description

    The following sections describe common NE Software Download problems. Each problem will be described and possible diagnostic reasons will be provided. Available work-arounds will be provided at the end of each problem reason description.

    4.1.1 DLM: Reboot/Load Manager is not Downloading an NE

    The NE is going through the reboot cycle but the NE is not being downloaded by the Reboot/Load Manager.

    Reason-1
    The NE is not commissioned.
    Open the Commissioning Manager and ensure that the NE has been commissioned with the proper data.

    Reason-2
    The NE is commissioned with the wrong data.
    Open the Commissioning Manager and ensure that the NE has been commissioned with the proper data. If the NE has been commissioned with the wrong serial number, use the Commissioning Manager to change the serial number to the proper value.
    If the commissioned NE release does not match an NE load release then use the online command:
    lomui setNE_release -p -r -n
    to change the NE release.

    Reason-3
    The download request is not being received by the OPC.
    Open tstatc and determine if the dlm MSR is receiving incoming messages. If no messages are being received then check for connectivity problems.

    See CAM: Associations are Down or are Unstable on page 2-4.
    See Complete set of OPC Tools on page A-1.


    Reason-4
    The NE network address is invalid or incomplete.
    The download request is being received by the OPC and the download server (dls) is being created, so the download is started, but the packets are never received by the NE. As a result, the download fails almost immediately.
    Using the tstatc tool, verify that the dls has been created. Check which NE address is being served by the dls server.

    Ref  PID   Proc  Ty  Messages      Dst. Osihost Name or Address.
                         in     out
    12   1358  dls   DB  5350   5350   49+00003D3141B01B0402

    Log on to the NE, go to the fa comm menu and obtain the NE address from there (for example 49+00003D3141B01B0400). Compare the NE addresses: the one reported by tstatc and the one obtained from the NE. If they are identical (except for the last two digits), contact the NE support group to verify what is happening on the NE and why the NE is not accepting received packets. If the NE addresses are different or the 49+0000 area code is missing, correct or add the area code using the areadaddr command on the NE (option 7 from the fa comm menu).
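
    The address comparison in Reason-4 can be written down as a small decision function. This is a sketch under stated assumptions: both addresses are typed in by hand from the tstatc display and the fa comm menu, the area code check is applied to both addresses (the text does not say which one may lack it), and the function name and return strings are hypothetical.

```python
AREA_CODE = "49+0000"


def compare_ne_addresses(tstatc_address, ne_address):
    """Apply the Reason-4 comparison: the addresses should be identical
    except for the last two digits, and carry the 49+0000 area code."""
    if not tstatc_address.startswith(AREA_CODE) or not ne_address.startswith(AREA_CODE):
        return "area code missing"
    if tstatc_address[:-2] == ne_address[:-2]:
        return "identical except last two digits"
    return "different"


# The two example addresses quoted in Reason-4.
print(compare_ne_addresses("49+00003D3141B01B0402", "49+00003D3141B01B0400"))
```

    The first result corresponds to contacting the NE support group; the other two correspond to correcting or adding the area code at the NE.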

    Reason-5
    The download request received by the OPC has the wrong serial number.
    Open the /var/log/syslog file and search for "not in OPC". The SWERR will contain a representation of the NE serial number, in a format of 8 numbers. Reconstruct the NE's serial number and ensure the serial number sent by the NE matches the NE serial number commissioned. If the serial numbers are different, it may be necessary to replace the shelf id card.

    Reason-6
    The download request received by the OPC has the wrong shelf type.
    Open the /var/log/syslog file and search for "not in OPC". The SWERR will refer to the shelf type. If the shelf type does not match the commissioned value, it may be necessary to replace the shelf id card.
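
    The syslog search used in Reason-5 and Reason-6 is a plain substring match. A minimal sketch follows; the sample log lines are invented for illustration and do not reproduce the real SWERR layout.

```python
def find_not_in_opc(lines):
    """Return the log lines that contain the text 'not in OPC'."""
    return [line.rstrip("\n") for line in lines if "not in OPC" in line]


# Invented sample lines; real SWERR entries will look different.
sample_lines = [
    "sample entry one: download request accepted\n",
    "sample entry two: NE not in OPC database\n",
]
print(find_not_in_opc(sample_lines))

# On an OPC the same function could be fed the real file, for example:
#   with open("/var/log/syslog") as log:
#       matches = find_not_in_opc(log)
```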

    Reason-7
    The SOC table does not contain all the NEs commissioned.
    Use the spock -u command to ensure that the SOC table and database contain the same information. The spock command can also be used to correct any misalignment of the SOC table.

    Reason-8
    The software release of the NE is set to NONE.
    NONE indicates that no releases have been found on the OPC. Use the install_release utility to install an NE load to the OPC. Commission the appropriate NE using the Commissioning Manager tool.

    Reason-9
    The specified NE software load doesn't exist for the selected release.
    A log will be generated in the Event Browser indicating load not found. Ensure the proper NE load exists in the /users/VFS/download directory. If the load does not exist in the directory, ftp the missing NE software load to the /users/VFS/download directory.

    Reason-10
    The catalogue file associated with the release that the rebooting NE is assigned to does not exist in the /users/VFS/download directory.
    A log will be generated in the Event Browser indicating Cannot find Catalog for release . FTP the missing catalogue file to the /users/VFS/download directory.

    Reason-11
    The catalogue file exists but refers to a different release than that of the rebooting NE.
    A log will be generated in the Event Browser indicating that Data mismatch between Catalog file occurred. Regenerate the catalogue file for the valid release and ftp it to the OPC.

    Reason-12
    The NE is being downloaded by the backup OPC.

    See DLM: Reboot/Load Manager indicates Fail on page 4-5.

    Reason-13
    The NE load is corrupt.

    See DLM: NE shelf processor firmware load is Corrupt or Incomplete on