High Availability Features in Intel Dialogic System Release 6.0

23
Intel in Communications High Availability Features in Intel ® Dialogic ® System Release 6.0 CompactPCI* for Windows* Application Note

Transcript of High Availability Features in Intel Dialogic System Release 6.0

Page 1: High Availability Features in Intel Dialogic System Release 6.0

Intel inCommunications

High Availability Features inIntel® Dialogic® System Release6.0 CompactPCI* for Windows*

Application Note

Page 2: High Availability Features in Intel Dialogic System Release 6.0

ContentsIntroduction 1

Peripheral Hot-Swap 1

Basic Hot-Swap 1

Full Hot-Swap 1

Redundant System Slot 1

Peripheral Redundancy 3

Software Architecture 3

Device Driver Interaction 4

RSS Software 5

RSS High Availability (HA) API 5

PHS Software 6

Hot-Swap Kit 6

Management Software 6

Fault Management 9

Alarm Management 11

Clock Management 11

Resource Management 11

Executing POST on Boards 13

CompactPCI* Platforms 16

Intel® NetStructure™ ZT5084 10U High Availability Platform 16

Intel® NetStructure™ ZT5985 12U Redundant Host Packet Switched 16Platform

Adding PHS Support 17

Adding Redundant System Slot Support 17

Sample Applications 17

For More Information 17

Appendix: Glossary 19

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows* Application Note

Page 3: High Availability Features in Intel Dialogic System Release 6.0

FiguresFigure 1: HA software architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

Figure 2: Example of registering for events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

Figure 3: Example of fault detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

Figure 4: Fault detection and recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

Figure 5: Event services state machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

Figure 6: Executing POST . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

Figure 7: Executing POST (continued) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

TablesTable : RSS API functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

Table 2: NCM API functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

Table 3: SRL API functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

Table 4: Event service API functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

Table 5: Types of faults . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows* Application Note

Page 4: High Availability Features in Intel Dialogic System Release 6.0

IntroductionThis paper discusses the new features of Intel®

Dialogic® System Release 6.0 for CompactPCIthat enable users to build high availability (HA)into telecommunication systems. The HAfeatures included in the System Release are:■ Peripheral hot-swap (PHS)

■ Redundant system slot (RSS)

■ Peripheral redundancy

The System Release supports a range ofCompactPCI servers and single-boardcomputers. This paper discusses in detail theRSS on the Performance Technologies*ZT5084 platform and the Intel® NetStructure™ZT5550 single-board computer.

Peripheral Hot-Swap Peripheral hot-swap (PHS) for CompactPCIsystems is one of the most popular and cost-effective HA architectures. It allows for theonline repair, upgrade, or addition ofperipherals in a CompactPCI chassis withoutthe need to power down the system.Peripherals can be telephony boards, diskdrives, fans, power supplies, management andalarm modules, and more. PHS can have asignificant impact on reducing downtime, bothplanned and unplanned.

PHS, as defined by the PICMG 2.1 and PICMG2.12 specs, can be divided into two models:basic hot-swap and full hot-swap.

Basic Hot-SwapThe basic hot-swap model defines theparameters and attributes to insert or remove aperipheral device (e.g., a board) without causingany interrupts or activity on the PCI bus. Sincethe backplane needs to be passive for this zero-activity, some operator intervention is required atthe console to indicate to the operating systemthat a board needs to be removed or inserted.When instructed by the operator, the operatingsystem shuts down all active operations on theboard, thus making the board non-functional inthe system and safe for removal. If a board isbeing inserted, the new CompactPCI signal,ENUM#, informs the operating system (OS) thata board is requesting enumeration and allocationof resources. This model is the simplest and lessautomatic.

Full Hot-SwapThe full hot-swap model enhances the basichot-swap model by defining a method that-indicates to the OS that a board is beinginserted into or removed from the system. Thisis accomplished by a microswitch attached tothe IEEE 1101.10-compliant board that signalsthe OS that an operator is about to insert orremove a board. This microswitch is connectedto the handles of the board that are used toeither insert or remove the peripheral device.When the microswitch is tripped, theenumeration interrupt (ENUM#) indicates to theOS about the insertion or removal. Theoperating system then signals the operator, viaa blue LED on the face of the board, that it isacceptable to remove the board. If a boardwere being inserted, the OS wouldautomatically configure the board, thuseliminating the need for reconfiguring thesystem at the console. This model is morecomplex to implement, but does not requireany operator intervention.

Redundant System SlotRedundant system slot (RSS) systems providefor redundant, hot-swappable, single-boardcomputers (SBCs) in a CompactPCI system.Such a system builds on the capabilities of aperipheral hot-swap (PHS) CompactPCI systemby eliminating the SBC as a single point offailure.

An RSS platform can support different modesof operation such as Active-Standby mode andActive-Active mode. The two SBCs in the RSSplatform are installed on the same CompactPCIbackplane, which can be configured withsoftware to simultaneously or independentlycontrol two CompactPCI bus segments.

In the Active-Standby mode, there are twoSBCs in the RSS platform. However, only oneSBC is active at any time. This one SBC hascontrol of all the I/O Slots. The standby SBC isaware of — and to some extent synchronizedwith — operations on the active SBC andready for failover to occur. The standby SBCmonitors and takes over operations if a failureshould occur on the active SBC.

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

1

Application Note

Page 5: High Availability Features in Intel Dialogic System Release 6.0

In the Active-Active mode (also called the Splitmode), each SBC controls one CompactPCIbus segment. Each SBC also acts as astandby for the bus segment it does notcontrol. In this model, both SBCs are able tocontribute resources. Customized software isprovided to take advantage of this model’sbenefits, which include fast failover into anActive-Standby state and load sharing andredundancy (both of which can be achieved inthis mode of operation). When an SBC fails,the second active SBC will takes overoperations of the failed one and continuesoperations on the peripherals managed by thefailed SBC.

There are two primary advantages of RSS:1. Elimination of the SBC as a single point

of failure, removing the need to duplicatecostly peripherals and extensive applicationchanges.

2. “Dark closet” operation, where there isno need to have an operator to add/removemalfunctioning peripheral devices.

Note that the RSS standard (PICMG 2.13) hasnot yet been ratified and many CompactPCIplatform vendors currently offer proprietary andnon-interoperable solutions. Thus, it isimportant to carefully evaluate the needs andrequirements of the system being built andchoose the right high-availability solution.

Application Note

2

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

Telephony Application

Management Software

NCM API Fault Detector Event Service API

DeviceManagement

SoftwareRSSAPI

PnP Observer Win32 API

Plug and Play System

TelephonyDeviceDrivers

RSS DriversNet SwapDrivers

ApplicationProgramming

Interface

ChassisVendor

Microsoft*

Hot-Swap Kit

Intel

TelephonyHardware

SBC SBC

PCI PCI

High Availability Software Component Architecture

Figure 1: HA software architecture

Page 6: High Availability Features in Intel Dialogic System Release 6.0

Peripheral RedundancyWhile PHS is effective in reducing the time torepair, by itself it does not protect againstoperational downtime or the time needed toprocure a spare device and to dispatch atechnician to make the repair. To protectagainst operational downtime, redundancy ofperipherals (N+1 redundancy) is introduced.With peripheral redundancy, if a peripheralmalfunctions, the spare peripheral takes overoperations of the malfunctioning peripheralwithout operator intervention. A repairtechnician can then be dispatched, somewhatless urgently, to restore redundancy to thesystem. Not only does peripheral redundancy

enable the replacement of failed componentswith minimal downtime, it also allows forpreventative maintenance.

Software ArchitectureThe System Release provides all the softwarecomponents needed to build HA into telephonyapplications. The components include the hot-swap driver kit that is configured forspecific chassis, management and faultdetection software, and sample demonstrationapplications.

Figure 1 shows the architecture of the softwarecomponents. The System Release provides allthe components shown in the figure.

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

3

Application Note

Scenario of Registering with the Event Notification Framework

Event NotificationFramework

The Application Fault Detector Telephony Driver Telephony Board

DlgAdmincnsumer::DlgAdminConsumer()

Create Observer forSpecified Channel

Specify Filtersand Channel

ImplementHandleEvent()

Callback

Enable Filters()Start Listening()

HandleEvent

getChannelName

Post Messagefor Dispatch

Send Message for Registering forSpecified Events

Network Alarm EventDetected

Figure 2: Example of registering events

Page 7: High Availability Features in Intel Dialogic System Release 6.0

The architecture is best explained by thefollowing scenarios. One shows how anapplication registers to receive notifications; asecond shows the interactions when a faultoccurs on a board.

Figure 2 shows the interactions between thecomponents used to register for alarm andfault notifications from peripheral boards. Hereis an outline of the application:■ The application creates an object of type

DlgAdminConsumer by invoking theconstructor of the class and passing in thechannel type (e.g. FAULT_CHANNEL).

■ This object then creates the necessaryconnections and sets up the communicationbetween the various software componentsinvolved in the transaction.

■ The application specifies filters for events it isinterested in monitoring.

■ A callback function HandleEvent() is imple-mented by the application invoked by theevent notification framework whenever aspecified event is observed.

■ When an event is observed, the applicationreceives all necessary information (i.e., theChannel name (FAULT_CHANNEL), the AUIDof the board) to perform any actions.

Figure 3 shows an example of an applicationreceiving an event when a DSP fault occurs ona peripheral board.

The main components in this transaction arethe application, the device driver for the board,and management software that includes theFault Detector and the Event Service alongwith the event notification framework. When aDSP on the telephony board fails, the firmwarerunning on the board notifies the controllerapplication by sending a message. Thiscontroller application is the Fault Detector thathas registered for various alarms and faults atthe time of initialization. When it receives thisfault notification, it queues an event to theEvent Service’s event notification framework.Eventually, the host application is notified ofthe event on the registered callback function.At that time, the application can act upon theevent and perform necessary operations oractions

Device Driver InteractionThe hot-swap system software residesbetween the operating system and hardwareand acts as a hot-plug system monitoringsoftware. The main tasks of this monitoringsoftware are to detect hot-swap events,identify the board’s required memory/interrupt

Application Note

4

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

Scenario of DSP Fault Detected by an Application

Telephony DriverTelephony Board Fault Detector Event Service The Application

Reset BoardAnd Run POST

Notify with DLGC_EVT_SP_FAILURE

Post Message on FAULT_CHANNEL

Message (Fault Information)

Fault Observed

Figure 3. Example of fault detection

Page 8: High Availability Features in Intel Dialogic System Release 6.0

resources, and dynamically allocate them (ordeallocate them upon board removal). Todetect live insertion/removal of CompactPCIdevices from the bus, the hot-swap engine canuse one of the following methods:■ Polling the CompactPCI bus

■ Polling the enumeration interrupt (ENUM#)

When the hot-swap system software detectsthe ENUM signal, it informs the Windows 2000subsystem, specifically the Plug-n-PlayManager, of the detected event. The hot-swapsystem software has a well-defined, publishedinterface with the Windows 2000 operatingsystem (which supports plug/play events).Through this mechanism, the operating systemis made aware of the newly-inserted device.The Plug-n-Play Manager provides amechanism for device drivers and otherapplications to be notified when certain eventsoccur on a specific device or on the system ingeneral. These events include arrival anddeparture of device interfaces of the specifiedclass, and device removal requests. When anevent occurs, the Plug-n-Play Manager modulecalls the device driver’s “add” or “init” entrypoints to process the initialization of the devicedrivers after allocating resources (e.g., interrupt,memory) as needed for the device.

Similarly, when the device is removed, thePlug-n-Play Manager calls the registered“remove” entry points in the device driver tohandle the device removal requests and handlethe freeing of allocated resources.

RSS SoftwareThe RSS software is a separate package thatcan be installed before or after the SystemRelease software. Refer to the release guide fora list of chassis tested with the release and forother system requirements.

For information about installing RSS software,see Redundant System Slot Software for the

ZT5550 High Availability Processor BoardSoftware Manual (RSS_Software_Manual.pdf).This manual and the executable file that installsthe RSS software (ZRSS.exe) are located in therss directory on the System Release CD-ROM.A sample application is included in this pack-age that allows you to simulate a takeover or afailover of CPU cards. This sample applicationalso demonstrates the use of the RSS API thatis part of the SDK. The API allows you toprogram for:■ Fault configuration

■ Isolation strategies

■ Application notification

■ Remote diagnostics

The RSS software provided by the SystemRelease supports the PerformanceTechnologies* ZT5084 CompactPCI systemand the ZT5550 system master board (SBC).As mentioned earlier, this paper discusses thefeatures of the CompactPCI system and SBC.

RSS High Availability (HA) APIThe RSS HA API is discussed in the softwaremanual (RSS_Software_Manual.pdf) providedby Performance Technologies. Some of the APIfunctions are discussed in this section. For atelephony application to support RSS, it needsto register itself for notifications from the HAdrivers and software components of theprocessor board. The rssmanager sampleapplication included in System Release forWindows 2000 shows you how to use the APIsprovided.

A host application uses the API by includingthe CompactHA.h header file and linking to theCompactHA.lib library file. Necessary parame-ter constants and types are defined inCompactHACnst.h and CompactHATypes.h.These header files and libraries are installedwhen the RSS software is installed on yoursystem.

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

5

Application Note

Page 9: High Availability Features in Intel Dialogic System Release 6.0

Table 1 lists some of the commonly used APIs

Application Note

6

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

API Description

HAConnect Connects the host application to the HA framework.

HADisconnect Terminates the connection with the host application.

HAConfigurationMode Sets the current host’s configuration mode. A host can only beplaced in the configuration mode if it is not the active one.

HAEnableNotification Enables the interrupt service routine to field the specified interrupttype. The callback function specified by the cbFunc parameterallows the host application to perform specific tasks based on theinterrupt occurring. The application may register to receivenotifications about faults and host state change by specifyingdifferent callback functions.

HADisableNotification Prevents the host-application-defined interrupt service routine fromfielding the specified interrupt types.

HAGetHostStatus Reports the current host system status. Statuses available for queryinclude system status and configuration information.

HAGetSlotID Retrieves the physical slot information for the calling host.

Table 1: RSS API functions

PHS SoftwareThe System Release software includes a hot-swap driver kit that can be configured forvarious chasses.

The telephony device drivers are configuredautomatically at install time for the specifiedchassis to perform the necessary hot-swapoperations.

Hot-Swap KitThe Hot-Swap Kit (HSK) is a CompactPCI hot-swap infrastructure product that supportsautomatic software connection and disconnec-tion when boards are hot-inserted or hot-extracted. HSK provides the functional devicedrivers that fully support the native DeviceDriver Model for Windows 2000. HSK was thefirst product for Windows 2000 to achieveGeneral Use Full Hot-Swap compliance asdefined by PICMG 2.1, the CompactPCI hot-swap specification.

With the HSK installed on your CompactPCIsystem, you can:

■ Insert and extract CompactPCI peripheralboards into and from a chassis with auto-matic software connection/disconnection ofthose boards and no rebooting.

■ Use native application notificationmechanisms to enable application monitor-ing of board insertions and removal requests.These notifications are delivered to theapplication via the Event Service API.

In conjunction with the HSK, the SystemRelease device drivers automatically configurethe PCI-to-PCI bridge windows forCompactPCI bus segments to ensure suffi-cient address space for hot insertions (sincethe BIOS-allocated windows are usually notsufficient). This is performed at driver initializa-tion time after the telephony boards have beendetected. Also, DCM, the configuration man-agement GUI, provides physical slot geogra-phy of your system and displays the physicalslot numbers, enabling operators to managethe systems better.

Management SoftwareThe System Release provides managementsoftware that allows you to configure andmonitor peripheral telephony devices. Faultdetection, repair, and isolation components arealso included.

Page 10: High Availability Features in Intel Dialogic System Release 6.0

The main components required to implementHA are:■ Fault management

■ Alarm management

■ Clock management

■ Peripheral resource management

The System Release includes the Event ServiceAPI and Event Notification Framework. The APIis used to register your application with theEvent Notification Framework. The Frameworkis the subsystem for all operations, administra-tion, maintenance, and provisioning (OAM&P)services to send asynchronous messages toregistered telephony applications. For detailedinformation on the Event Service API and theEvent Notification Framework, refer to the pro-gramming guides included in the release docu-

mentation. The framework contains variouschannels used to report a set of events thatcorrespond to solicited or unsolicited actions ofan operator.

Another library, the NCM API, provides an APIto manage and monitor operations on theperipheral telephony devices. This API allowsyou to retrieve board-level information such asthe physical slot ID, PCI Bus information, andCT Bus information. It also has functions tostart/stop/quiesce boards.

Tables 2, 3, and 4 list the functions that wouldbe used to develop an application to supportPHS and redundant system slot features of theSystem Release. The sample applicationsincluded in the System Release — rgademo,rssmanager and pfmanager — illustrate the useof these APIs.

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

7

Application Note

API Description

NCM_IsHotSwapSystem Determine if the system has hot-swap capabilities.

NCM_GetHotSwapBoardCount Get the number of peripheral boards that currently are inthe hot-swap capable system.

NCM_GetValueEx Retrieve the value for a NCM database parameter.

NCM_SetValueEx Set a value to a NCM database parameter.

NCM_DeallocValue Release the memory allocated for the NCM databaseparameter.

NCM_GetFamilyDeviceByAUID Retrieve the family type given the AUID of the board.

NCM_GetInstalledFamilies Retrieve the family types of all installed boards.

NCM_GetInstalledDevices Retrieve the list of installed boards.

NCM_StartDlgSrv Start the Intel® Dialogic® system service.

NCM_GetDlgSrvState Retrieve the state of the service.

NCM_StopDlgSrv Stop the Intel Dialogic system service.

NCM_StartBoard Start a single board.

NCM_StopBoard Stop a single board.

NCM_RemoveBoard Remove a board from the NCM database.

NCM_GetDialogicDir Get a pointer to the System Release install folder.

Table 2: NCM API functions

Page 11: High Availability Features in Intel Dialogic System Release 6.0

API Description

SRLGetAllPhysicalBoards Get a list of boards currently installed in thesystem.

SRLGetVirtualBoardsOnPhysicalBoard Retrieve the number of virtual boards on aphysical board.

SRLGetSubDevicesOnVirtualBoard Retrieve the number of sub-devices on a virtualboard.

Application Note

8

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

Table 3: SRL API functions

API Description

DlgAdminConsumer::DlgAdminConsumer( ) Allows you to instantiate a consumer object.Each DlgAdminConsumer object must beassociated with one event notificationchannel.

DlgAdminConsumer::DisableFilters( ) Disables a DlgAdminConsumer object’sarray of filters.

DlgAdminConsumer::EnableFilters( ) Enables a DlgAdminConsumer object’s arrayof filters.

DlgAdminConsumer::getChannelName( ) Returns the channel name that aDlgAdminConsumer object monitors forincoming events.

DlgAdminConsumer::getConsumerName( ) Returns the name of the DlgAdminConsumerobject. This name was associated with theconsumer object at time of instantiation.

DlgAdminConsumer:: StartListening( ) Allows the DlgAdminConsumer object tobegin monitoring its associated eventnotification channel for incoming events.

CEventHandlerAdaptor::HandleEvent( ) A virtual callback function invoked by theframework when an event is detected.

Table 4: Event Services API functions

Page 12: High Availability Features in Intel Dialogic System Release 6.0

Fault ManagementAny malfunction of a hardware device is termedas a fault. The Event Service has componentsthat are constantly monitoring the hardwaredevices by either polling or looking for aheartbeat signal. When the components detecta loss of the heartbeat signal, a fault isgenerated. Applications registered for the faultare notified appropriately.

There are two kinds of faults on any Intel®

NetStructure™ board: Control Processor faultsand Signal Processor faults. These faults aredetected by the device driver via a mechanismsetup between the device drivers and theindividual kernel on each board (firmware). The

device driver notifies the registered OAM&Pservices via function callbacks, which in-turnreport the event on the FAULT_CHANNEL. Thespecific events that are reported are:

■ DLGC_EVT_CP_FAILURE — Generatedwhen a control processor failure occurs onan Intel® NetStructure™ board.

■ DLGC_EVT_SP_FAILURE — Generatedwhen a signal processor failure occurs on anIntel NetStructure board.

Detection and RecoveryTable 5 lists relevant events and recoverymechanisms.

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

9

Application Note

Type Why Fault is Generated Actions Application Should Take

DLGC_EVT_CP_FAILURE When the control processor Close all devices that are openthat runs the firmware on a on that physical board.board fails or asserts for anyreason. Restart POST on the board

and try to recover.

DLGC_EVT_SP_FAILURE If a physical board has voice Close all devices that are openmedia capabilities, then there on that physical board.are specialized DSPs thatcould fail due to a variety of Restart the board and try toreasons. If such a failure recover the channels on thatoccurs, then this event is board. reported to registeredapplications.

Table 5: Types of faults

Page 13: High Availability Features in Intel Dialogic System Release 6.0

The flowchart in Figure 4 shows appropriate actions that are performed when faults occur ona board.

Application Note

10

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

Welcome tothe ATSC

Voice Portal

What do youwant to do...

Welcome tothe

Voice Portal

What do youwant to do...

Welcome tothe

Voice Portal

What do youwant to do...

NCM_StopBoard

Running

Decision State

NCM_StartBoard

Stop BoardRun POST

andDiagnostics

DLGC_EVT_CP_FAILUREDLGC_EVT_SP_FAILURE

Before stopping the board,you have to invoke thexx_close() on all thedevices that are open onthat physical board.

DLGC_EVT_BLADE_STOPPED

DLGC_EVT_BLADE_STARTED

PASS

Legend

Remove Board

FAIL

FAIL

Process

Figure 4. Fault detection and recovery

Page 14: High Availability Features in Intel Dialogic System Release 6.0

Alarm ManagementAlarms are disruptions that occur on both thecircuit and packet networks. On a circuitnetwork, an alarm could be the T-1/E-1 cablebeing disconnected, a loss of frame signal, etc.On the packet network, an alarm could be theEthernet cable being disconnected or a loss ofsignal with a router, etc. The Event Service, incollaboration with the Fault Detection Services,detects such alarms. The alarms on the circuitnetwork are reported to the application via theNETWORK_ALARM_CHANNEL. The alarms onthe packet (IP) network are reported to theapplication via the ENET_ALARM_CHANNEL.

Most of these alarms are also reported to a callcontrol application via the Global Call alarmnotification services, if the application hasenabled such services. Appropriate Global Callevents are generated into the SRL event queueand the application is notified. For example,when there is a RED alarm on the circuitconnected to a T-1/E-1 span, aGCEV_BLOCKED event is generated to theapplication. When the alarm is cleared, aGCEV_UNBLOCKED event is generated.

Clock ManagementThe CT Bus can be programmatically config-ured for various setups. An OAM&P service, CTBus Broker, monitors all activity of the CT Bus.When any failures occur the CT Bus broker,using the facilities of the Event Service, reportsthe events to registered applications on theCLOCK_EVENT_CHANNEL.

Types of alarms on the CT Bus include:■ DLGC_EVT_CT_A_LINESBAD — Occurs if

the signal on the CT Bus Line A fails.

■ DLGC_EVT_CT_B_LINESBAD — Occurs ifthe signal on the CT Bus Line B fails.

■ DLGC_EVT_LOSS_MASTER_SOURCE_INVALID — Signals that the source used bythe primary master board to drive the primaryline has failed. The primary master board canuse its own internal oscillator or a CT BusNetwork Reference line as its clock source.

■ DLGC_EVT_NETREF1_LINEBAD —Indicates that the signal on the CT BusNetRef 1 line has failed.

■ DLGC_EVT_NETREF2_LINEBAD —Indicates that the signal on the CT BusNetRef 2 line has failed.

Almost all of these events simply provideinformation to the application. The applicationdoes not need to take any action when one ofthese events is observed, since the OAM&Pservices handle the clock fallback and failovermechanisms when something goes wrong.

Resource ManagementResource management for the Intel®

NetStructure™ board is handled by theStandard Runtime Library (SRL) API and theEvent Service API using the Event NotificationFramework. Device enumeration and discoveryare done using the SRL and NCM APIs.

■ SRLGetVirtualBoardsOnPhysicalBoard()— Used to retrieve the number of virtualboards on a physical board identified by theAUID.

■ SRLGetSubDevicesOnVirtualBoard() —Used to retrieve the number of sub-deviceson a virtual board.

For example, suppose we have an Intel®

NetStructure™ DMN160TEC board in thesystem. This physical PSTN network board has16 T-1 or E-1 trunks, each represented by thedevice name dtiBn, where n represents a num-ber from 1 to 16. By invoking the SRL function,SRLGetVirtualBoardsOnPhysicalBoard(), wewould get 16 and the device type would beDTI. Using this information, we would build 16device names (i.e., dtiB1, dtiB2, and so on,through dtiB16). Also, by invoking the SRLfunction, SRLGetSubDevicesOnVirtualBoard(),we would learn how many timeslots exist oneach virtual board. If the DMN160TEC boardwere configured for a T-1 ISDN protocol, wewould get 23 timeslots. If it were configured foran E-1 ISDN protocol, we would get 30 as theoutput from the function.

The hardware device detection is done usingthe event notification framework and the EventService API. When a device is inserted onremoved from the system, the Plug-n-PlayObserver reports events to the Event Servicethat would deliver the same on theADMIN_CHANNEL to registered applications.

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

11

Application Note

Page 15: High Availability Features in Intel Dialogic System Release 6.0

The various events that are reported include:■ DLGC_EVT_BLADE_ABOUT_TO_REMOVE

— Generated when the Device >Remove/Uninstall Device option is selectedin the Intel® Dialogic® Configuration Manager(DCM).

■ DLGC_EVT_BLADE_ABOUTTOSTART —Occurs when an individual board startcommand has been issued (either throughthe DCM’s Device > Start Device option orprogrammatically with the NCM_StartBoard( )function).

■ DLGC_EVT_BLADE_ABOUTTOSTOP —Occurs when an individual board stopcommand has been issued (either throughthe DCM Device > Stop Device option orprogrammatically with the NCM_StopBoard( )function).

■ DLGC_EVT_BLADE_DETECTED —Indicates that a newly-inserted board hasbeen detected by the System Releasesoftware and its initial configuration informa-tion has been stored in the NCM database.

■ DLGC_EVT_BLADE_REMOVED —Generated when a board has been removedfrom the system and its configurationinformation has been deleted from the NCMdatabase.

■ DLGC_EVT_BLADE_START_FAILED —Occurs if an individual board’s startsequence has failed. (The board startsequence can be initiated through DCM’sDevice > Start Device option or program-matically with the NCM_StartBoard( )function).

■ DLGC_EVT_BLADE_STARTED —Generated immediately after an individualboard has been successfully started. (Theboard start can be initiated through DCM’sDevice > Start Device option orprogrammatically with the NCM_StartBoard( )function).

■ DLGC_EVT_BLADE_STOPPED —Generated immediately after an individualboard has been successfully stopped. (Theboard stop can be initiated through DCM’sDevice > Stop Device option orprogrammatically with the NCM_StopBoard( )function).

■ DLGC_EVT_SYSTEM_ABOUTTOSTART— Occurs when a system start commandhas been issued (either through the DCM’sSystem > Start System option orprogrammatically with theNCM_StartDlgSrv() function).

■ DLGC_EVT_SYSTEM_ABOUTTOSTOP —Occurs when a system stop command hasbeen issued (either through the DCM’sSystem > Stop System option orprogrammatically with the NCM_StopDlgSrv( )function).

■ DLGC_EVT_SYSTEM_STARTED —Generated immediately after the system hasbeen successfully started. (The system startcan be initiated through DCM’s System >Start System option or programmaticallywith the NCM_StartDlgSrv( ) function).

■ DLGC_EVT_SYSTEM_STOPPED—Generated immediately after the system hasbeen successfully stopped. (The systemstop can be initiated through DCM’s System> Stop System option or programmaticallywith the NCM_StopDlgSrv( ) function).

The state chart in Figure 5 illustrates thevarious events listed above and shows a statemachine implemented in the PFMDemoapplication included in the System Release.

Application Note

12

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

Page 16: High Availability Features in Intel Dialogic System Release 6.0

Figure 5: Event services state machine

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

13

Application Note

Peripheral Inserted

Decision State

NCM_StopBoard

This is the state in which the application is waiting to discover newperipheral boards. When th DLGC_EVT_BLADE_PHYS_INSERTEDevent is received by the Event Service, the application is notified.At this time, the application makes itself ready to handle this new board.

Legend

Peripheral Detected

Peripheral About to Start

Peripheral Started

Peripheral About to Stop

DLGC_EVT_SYSTEM_ABOUTTOSTOP

DLGC_EVT_BLADE_STOPPED Peripheral Stopped Peripheral Removed

What do youwant to do...

Run POST andDiagnostics

Peripheral Diagnose

DLGC_EVT_BLADE_DETECTED

NCM_StartBoard

FAIL

PASS

FAIL

When the application receives theDLGC_EVT_BLADE_STARTED event from the EventService, it needs to perform the device discoveryactions by calling the SRL functionsSRLGetVirtualBoardsOnPhysicalBoard() to retrievethe number of virtual boards on this physical board,and SRLGetSubDevicesOnVirtualBoard() to retrievethe number of sub-devices on each virtual board.After getting the necessary information theappropriate xx_open() functions need to be invoked.

DLGC_EVT_BLADE_PHYS_INSERTED

Executing POST on BoardsWhen a peripheral board is detected in thesystem, it is recommended that you executePOST on that device to ensure that thehardware is good and functional. The SystemRelease provides POST utilities that can beexecuted individually based on the type of

hardware and the family to which it belongs.Figure 6 illustrates how this can be done. Thiscode snippet is part of the pfmanager sampleapplication of the System Release.

The Diagnose() function shows you how toinvoke the DM3 and IPT post utilities.

Page 17: High Availability Features in Intel Dialogic System Release 6.0

The RunProgram() function in Figure 7, shows a way of creating a Windows process, then waitingfor that process to complete execution.

NCMRetCodeCDevice::Diagnose (){long lBoardId = 0;char szCmd[256];char szDir[128];DWORD dwRc;NCMRetCode ncmRet;int iLen = 128;

ncmRet = NCM_GetDialogicDir (“DNASDKDIR”, &iLen, szDir);if (ncmRet == NCM_SUCCESS){if (!strcmp(m_Family.name,”DM3”)){//run dm3 post silentlysprintf (szCmd,”%s\\Dm3Post -b%d -s%d”, szDir,m_PCIBus,

m_PCISlot);dwRc = RunProgram (szCmd);switch (dwRc){case 0:case DM38BITOK:case DM316BITOK:case DTI168BITOK:case DTI1616BITOK:ncmRet = NCM_SUCCESS;break;

default:ncmRet = NCME_GENERAL;

}}else if (!strcmp(m_Family.name, “IPT”)){//run ipt post.

.

.

.}else{ncmRet = NCME_GENERAL;

}}

return ncmRet;}

Application Note

14

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

Figure 6. Executing POST

Page 18: High Availability Features in Intel Dialogic System Release 6.0

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

15

Application Note

DWORD

CDevice::RunProgram (char *lpCommandLine){STARTUPINFO si;PROCESS_INFORMATION pi;DWORD retcode;

si.cb = sizeof (STARTUPINFO);si.lpReserved = NULL;si.lpDesktop = NULL;si.lpTitle = NULL;si.dwFlags = 0;si.wShowWindow = SW_SHOW; //hide the programsi.cbReserved2 = 0;si.lpReserved2 = NULL;

pi.hProcess = NULL;pi.hThread = NULL;

if (!CreateProcess (NULL, lpCommandLine, NULL, NULL, false,CREATE_NEW_PROCESS_GROUP, NULL, NULL, &si, &pi))

{return -1;}

switch (WaitForSingleObject (pi.hProcess, 120000)){case WAIT_ABANDONED:case WAIT_TIMEOUT:return -1;break;

default:GetExitCodeProcess (pi.hProcess, &retcode);break;

}

if (pi.hProcess)CloseHandle (pi.hProcess);

if (pi.hThread)CloseHandle (pi.hThread);

return retcode;}

Figure 7. Executing POST (continued)

Page 19: High Availability Features in Intel Dialogic System Release 6.0

CompactPCI PlatformsThis section provides some details on theCompactPCI systems that have been used tovalidate the System Release software.

Note that the ZT5084 platform is fullysupported by the System Release software.However, there is limited support for theZT5085 platform in the form of PHS. RSS isnot supported. Some specific configurationsteps need to be performed when setting upthe System Release software in the ZT5085platform.

Intel® NetStructure™ ZT5084 10U HighAvailability PlatformThis HA CompactPCI platform provides acarrier-grade computing system fordemanding, mission-critical applications. TheZT5084 platform supports five-nines(99.999%) availability with built-in redundancyfor active system components includingsystem-slot CPU boards, power supplies, andalarming. These components are hot-swappable to simplify replacement andminimize service time.

The ZT5084 platform is ideal for telecomapplications requiring high system availability(e.g., enhanced services, media gateways,broadband access servers, or any criticalcomputing server platform destined for thecentral office). The hardware-based failoverand simplified HA driver model shortendevelopment time for telecom equipmentdesigners, while redundant system-slotarchitecture enables more efficient use ofexpensive I/O resources. This CompactPCIsystem has 12 slots available for peripheraldevices.

The ZT5550 High Availability Processor Boardis the only supported processor board for theZT5084 platform. This 6U, CompactPCI boardis designed for redundant processing configu-rations where very high levels of system avail-ability are required. Its architecture is tailoredfor demanding applications such as those ser-vicing the telecom network and the Internet.

The ZT5550 High Availability Processor Boardsupports up to 12 CompactPCI peripheralboards and, when used with a second ZT5550

board, can provide five-nines availability. Itfeatures the Intel‚ Pentium® III processor low-power module running at 500MHz, is hot-swappable, and includes several on-boardperipherals and optional I/O expansionfeatures. The board occupies one or two slots,depending on the configuration.

Intel® NetStructure™ ZT5085 12URedundant Host Packet SwitchedPlatformThe Intel® NetStructure™ ZT5085 12URedundant Host Packet Switched Platformfeatures a PICMG* 2.16-compatible mid-planesupporting redundant-host architecture for I/O-intensive applications. It is one of severalmodular telecom building blocks from Intel,providing OEM equipment designers with acarrier-grade, standards based, HA computingplatform for demanding mission-criticalapplications.

The Platform supports five-nines availabilitywith built-in redundancy for active systemcomponents including Ethernet switches,chassis management modules, powersupplies, and fan trays. Redundant chassismanagement modules make it possible tomanage multiple SBCs and conduct chassisdiagnostics remotely for enhanced systemreliability. Ethernet signals are routed acrossthe mid-plane without the use of cables,saving time in set-up, maintenance, and repair;and minimizing the thermal challenges oftraditional cabling methods.

The Platform is designed to interoperate withall members of the Intel® NetStructure™ familyof packet-switched products and with third-party boards meeting the PICMG 2.16specifications.

There are two supported processor boards forthe ZT5085 Platform, the Intel® NetStructure™ZT5504 and Intel® NetStructure™ ZT5524boards.

The ZT5504 processor board is a 2.16-compliant processor board that offersexcellent value with an optimized feature set tosupport a broad range of telecom and Internetapplications. Modular and standards-based,the ZT5504 supports efficient time-to-market

Application Note

16

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

Page 20: High Availability Features in Intel Dialogic System Release 6.0

development. Completely validated with theIntel® NetStructure™ family of packet switchedbackplane (PSB) products, it is designed to beinteroperable with third-party 2.16-compliantcomponents. The board features a 1GHz, low-power Pentium III processor with 512Mbytes to 1 Gbyte ECC SDRAM.

The ZT5524 high-performance processorboard is a standards-based building blockdesigned for carrier-grade telecom and Internetapplications requiring exceptional processingpower and HA. The dual processor/redundanthost board is PICMG* 2.16-compliant andoffers configurable HA, I/O expansion, and66MHz CompactPCI bridging features. Agenerous set of on-board embedded featuresand reliable, off-the-shelf architecture addressthe integration and reliability requirements ofOEM systems builders. This board features asingle or dual 933MHz Pentium III processorthat supports symmetric multiprocessing insingle CompactPCI slot with a single 168-pin, right-angle DIMM module socket. It supportsup to 1GB PC133 SDRAM memory.

Adding PHS supportCertain changes are required to support PHS inan existing telephony application. The changescan be summarized into the following steps:1. Enumerate the peripheral device events to

which you want the application to listen. Forexample, you may only want events on theFAULT_CHANNEL and the ADMIN_CHANNEL.

2. Register the application to receive theperipheral device events via the EventService. This is done using the EventService APIs.

3. Build a state machine in your application(similar to the one shown in Figure 5) toproperly handle the various events.

Adding Redundant System SlotSupportAn existing telephony application needs tomake certain changes to support the RSSfeature. These can be summarized as follows:1. Identify the chassis vendor’s device driver

events. This should be available from thevendor’s documentation. The RSS Managersample application provided with theSystem Release illustrates the use of theZT5084 chassis and the ZT5550 SBCboards. It registers for specific events fromthe device driver provided by the chassisvendor.

2. Register the application to receive theevents on the ADMIN_CHANNEL of theEvent Service. This will enable you to moni-tor the activities of the Intel‚ NetStructure‘boards.

3. Register the application to receive eventsfrom the RSS HA framework provided bythe chassis vendor. This will enable theapplication to monitor the activities of theprocessor boards in the system.

4. Build a state machine into your applicationthat will perform appropriate actions whenthe Event Service reports the board levelevents.

Sample ApplicationsThe System Release software contains sampleapplications that demonstrate the use of theHA features supported. There are fourapplications:1. RSS Manager — Demonstrates the use of

the RSS HA APIs provided by the ZT5550system master board from PerformanceTechnologies. The application monitors theactivities on the SBC. When an active SBCgoes down (due to a friendly takeover,hostile takeover, or critical fault), theapplication is notified via a callback functionthat indicates that a takeover occurred andthat the standby SBC is now the currentactive host. At this time, the applicationperforms necessary peripheral operationson the Intel NetStructure boards.

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

17

Application Note

Page 21: High Availability Features in Intel Dialogic System Release 6.0

2. Peripheral Fault Manager (PFM) —Allows users to start/stop peripheral boardsinstalled in a system. It demonstrates theuse of Event Notification Framework eventsretrieved by using the Event Service API.

3. Revenue Generating Application(rgademo) — This call control applicationworks with the Peripheral Fault Managerapplication and the RSS Managerapplication. It also registers itself as aconsumer to the various events generatedby the Event Notification Framework. Itmonitors activities on all peripheral devicesinstalled in the system. When a telephonyperipheral is inserted into the system,appropriate events are detected by thisapplication. After the peripheral is initializedand started, ISDN calls are made in aloopback fashion for demonstration

purposes. Refer to the user’s guide for

more details on how to execute theapplication and set up the boards.

4. HA Demo — This demonstrationapplication, provided by PerformanceTechnologies, illustrates the use of the RSSHA APIs and the framework provided bythe ZT5550 system master board in theZT5084 platform.

The System Release contains a demo guide,“High Availability for Windows Demo Guide,”that describes these sample applications indetail.

For More InformationLearn more about Intel Dialogic SystemRelease 6.0 CompactPCI for Windows 2000and download the software athttp://www.intel.com/network/csp/products/8326web.htm.

Application Note

18

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

Page 22: High Availability Features in Intel Dialogic System Release 6.0

Appendix: GlossaryAPI Application programming interface

AUID Addressable unique identifier

DSP Digital signal processor

HA High availability

HSK Hot-Swap Kit

NCM Native configuration manager

OAM&P Operations, administration, maintenance, and provisioning

PHS Peripheral hot-swap

PICMG PCI Industrial Manufacturing Group

RSS Redundant system slot

SBC Single-board computer

High Availability Features in Intel® Dialogic® System Release 6.0 CompactPCI* for Windows*

19

Application Note

Page 23: High Availability Features in Intel Dialogic System Release 6.0

To learn more, visit our site on the World Wide Web at http://www.intel.com.

1515 Route TenParsippany, NJ 07054Phone: 1-973-993-3000

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS ORIMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT.EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OFINTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE,MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

Intel products are not intended for use in medical, life saving, life sustaining applications.

Intel may make changes to specifications and product descriptions at any time, without notice.

Intel, Intel Dialogic, Intel NetStructure, Pentium, and the Intel logo are trademarks or registered trademarks of Intel Corporation orits subsidiaries in the United States and other countries.

*Other names and brands may be claimed as the property of others.

Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximateperformance of Intel products as measured by those tests. Any difference in system hardware or software design or configurationmay affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems orcomponents they are considering purchasing. For more information on performance tests and on the performance of Intel products,reference http://www.intel.com/procs/perf/limits.htm or call (U.S.) 1-800-628-8686 or 1-916-356-3104.

Printed in the USA Copyright © 2003 Intel Corporation All rights reserved. e Printed on recycled paper. 03/03 00-8549-001