Comm Dev 2007

Meet us at booth 211

Developing SIP
Jonathan Cumming
Director, VoIP Product Management
Data Connection (DCL)

Agenda

Designing the right product
  Target markets and device characteristics

Getting to market efficiently
  Development choices
  Diagnosing problems

Designing for Scale
  SIP load balancing mechanisms

Designing for High-Availability
  Remote failures
  Local failures

Q & A

Data Connection Ltd. (DCL)

Networking Protocols Division
  Protocol software for OEMs

Internet Applications Division
  Communication application software for SPs

Enterprise Connectivity Division
  SNA software for OEMs and Enterprises

Class 4/5 softswitch solutions for IOCs/CLECs
  300+ deployments
  3m subscriber capacity

Market Segments

Which applications?
  Voice, Video, Instant Messaging, Presence, Gaming

Which SIP variant?

What type of customer and device?
  Carrier, Enterprise, Consumer

Device Characteristics

What feature set?
  Endpoint vs. Proxy vs. B2BUA
  TCP, TLS, SCTP, SigComp

What scale?
  Initial footprint vs. scalability

How reliable?
  Occasional reboot/crash acceptable
  5x9s (99.999% availability) => fault-tolerant architecture

How secure?
  What types of attack are likely?
  What is the risk and impact of a DoS attack?
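To put "5x9s" in concrete terms, the sketch below converts an availability percentage into a yearly downtime budget. The function name and the 365-day year are assumptions made for illustration.

```python
def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the downtime budget, in minutes, for a 365-day year."""
    minutes_per_year = 365 * 24 * 60  # 525,600 minutes
    return minutes_per_year * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    budget = downtime_minutes_per_year(target)
    print(f"{target}% availability allows {budget:.2f} minutes of downtime per year")
```

At 99.999% the budget is a little over five minutes per year, which leaves no room for a manual restart and is why the slide links five nines to a fault-tolerant architecture.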

Development Choices

Platform choices
  O/S (POSIX)
  Hardware
    ATCA, CompactPCI, Network Processors, Multi-core
  HA Middleware
  Increased off-the-shelf integration
    Ensure that all components meet your requirements

Make vs. Buy: Open source vs. Commercially licensed
  Advanced features
    Scalability and High Availability
    Extensibility to support new features
    Comprehensive diagnostics
  Guaranteed support to minimize total cost of ownership
    Timely enhancements to support new functionality
    Problem diagnosis and fixes for interoperability issues and bugs
    Help with application design

Diagnosing Problems

Wide range of issues
  Interoperability, e.g. NAT
  Crashes, i.e. service outage
  Performance: QoS, DoS

Requires comprehensive diagnostics
  Effective runtime filtering

[Diagram: diagnostic tools arranged by scope (intra-component, device-level) and by use (development environment, field use)]
  Trace of execution
  FSM history
  Inter-component tracing
  Event logs: Developer, Information, Warnings and Problems
  External line trace, e.g. Wireshark / Ethereal
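One way to picture effective runtime filtering is a per-component verbosity setting that can be changed while the process runs. This is a minimal sketch using Python's standard logging module; the component names ("sip.transaction", "sip.transport") and the helper set_trace_level are hypothetical, not part of any DCL product.

```python
import logging

# Install a handler on the root logger so that emitted records are printed.
logging.basicConfig(format="%(name)s %(levelname)s %(message)s")
# Default for every SIP component: warnings and problems only.
logging.getLogger("sip").setLevel(logging.WARNING)


def set_trace_level(component: str, level: int) -> None:
    """Change one component's verbosity while the process keeps running."""
    logging.getLogger(component).setLevel(level)


transaction_log = logging.getLogger("sip.transaction")
transport_log = logging.getLogger("sip.transport")

transaction_log.debug("suppressed: component is still at WARNING")
set_trace_level("sip.transaction", logging.DEBUG)
transaction_log.debug("emitted: only this component is now at DEBUG")
transport_log.debug("suppressed: other components are unaffected")
```

Filtering at the component level keeps the cost of tracing low in the field, because only the component under suspicion produces detailed output.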

Inter-Component Tracing

[Screenshot of an inter-component trace, with callouts: Details; Components; Chronology; Time Stamps; Separate Source & Destination Time Stamps]
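The value of separate source and destination time stamps can be sketched with a small trace record. All names here (TraceRecord, chronology, slowest_hop) and the sample data are hypothetical, and the sketch assumes both time stamps come from the same clock.

```python
from dataclasses import dataclass


@dataclass
class TraceRecord:
    source: str         # component that sent the message
    destination: str    # component that received it
    sent_at: float      # time stamp taken by the source, in seconds
    received_at: float  # time stamp taken by the destination, in seconds
    detail: str

    @property
    def queue_delay(self) -> float:
        """Time the message spent between the two components."""
        return self.received_at - self.sent_at


def chronology(records):
    """Order records by the time the source sent them."""
    return sorted(records, key=lambda record: record.sent_at)


def slowest_hop(records):
    """Return the record with the largest source-to-destination delay."""
    return max(records, key=lambda record: record.queue_delay)


records = [
    TraceRecord("transaction", "dialog", 0.020, 0.021, "INVITE passed up"),
    TraceRecord("transport", "transaction", 0.010, 0.018, "INVITE received"),
    TraceRecord("dialog", "application", 0.022, 0.023, "new call indication"),
]

for record in chronology(records):
    print(f"{record.sent_at:.3f} {record.source} -> {record.destination}: {record.detail}")
print("slowest hop starts at:", slowest_hop(records).source)
```

With a single time stamp per message, a delay between components is invisible; with two, a backlog in one component's input queue shows up directly.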


Designing for Scale

Faster processor
  May be limited by software and hardware bottlenecks

Distribution of software components
  Requires modular software architecture
  Suitable for multi-card and SMP systems

Distribution to multiple devices
  Two distinct scenarios
    Out-of-dialog requests
    Dialogs
  One of SIP's key strengths: applies equally to proxies and endpoints
  Load-balancing is a specific form of distribution

[Chart: CPU Utilization. Y axis: % utilization (0 to 35). X axis: calls per second (0 to 240). Series: B2BUA, Stateless Proxy, Trx Stateful Proxy, Call Stateful Proxy]


Distribution Principles

Configuration
  Out of band mechanism
    Entered by user
    DHCP

Redirection
  Initial response indicates nominated server
    DNS
    SIP 3xx response

Proxy
  Initial request forwarded
  Creates path for future requests
    Direct
    Via proxy
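The redirection principle can be sketched as a server that answers an initial request with a SIP 3xx response naming the nominated server. This is a simplified illustration, not a complete implementation: the class name, the round-robin policy and the example.com addresses are assumptions, and a real server would also add a tag to the To header.

```python
import itertools


class RedirectServer:
    """Nominate servers in rotation by answering with 302 Moved Temporarily."""

    def __init__(self, servers):
        self._rotation = itertools.cycle(servers)

    def redirect(self, request_headers: dict) -> str:
        target = next(self._rotation)
        lines = ["SIP/2.0 302 Moved Temporarily"]
        # The response echoes these header fields from the request.
        for name in ("Via", "From", "To", "Call-ID", "CSeq"):
            lines.append(f"{name}: {request_headers[name]}")
        # The Contact header carries the address the client should try next.
        lines.append(f"Contact: <{target}>")
        lines.append("Content-Length: 0")
        return "\r\n".join(lines) + "\r\n\r\n"


request = {
    "Via": "SIP/2.0/UDP 192.0.2.1;branch=z9hG4bK776asdhds",
    "From": "<sip:alice@example.com>;tag=1928301774",
    "To": "<sip:bob@example.com>",
    "Call-ID": "abc123@192.0.2.1",
    "CSeq": "1 INVITE",
}
server = RedirectServer(["sip:proxy1.example.com", "sip:proxy2.example.com"])
print(server.redirect(request))
```

The redirecting server drops out after this one exchange, which is what distinguishes redirection from the proxy approach, where the first hop forwards the request itself.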


Registrations

Registration is a heavy load
  Soft-state: regular re-registrations for all devices
  Also used to maintain NAT/Firewall pinholes

Distributing initial REGISTER request
  Static configuration to use different registrars
  DNS
  Multicast

Distributing subsequent out-of-dialog requests
  First-hop security
    Initial registration establishes secure tunnel to nominated server
    Subsequent messages use this tunnel, overriding other routing
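A rough calculation shows why soft-state registration is a heavy load. The device count and refresh intervals below are hypothetical figures chosen for illustration, and the sketch assumes re-registrations are spread evenly over time.

```python
def register_rate(devices: int, refresh_interval_s: float) -> float:
    """Average REGISTER requests per second for evenly spread refreshes."""
    return devices / refresh_interval_s


devices = 1_000_000  # hypothetical deployment size

# Hourly refresh, used purely to keep the binding alive.
print(f"{register_rate(devices, 3600):.1f} REGISTERs per second")

# 30-second refresh, as might be chosen to hold NAT/firewall pinholes open.
print(f"{register_rate(devices, 30):.1f} REGISTERs per second")
```

Shortening the interval from an hour to 30 seconds multiplies the load by 120, before any call traffic is counted, which is the motivation for distributing REGISTER requests across registrars.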


Dialogs

Distributing dialog-establishing requests
  Static configuration to use different proxies or servers
  Service-Route header
    Returned on REGISTER response
    Causes all requests from a given endpoint to be routed via nominated proxies
  Redirection and Proxy
    DNS and 3xx responses

Distributing in-dialog requests
  Contact and Record-Route headers
    Returned on response to dialog-creating request
    Directs all requests within the dialog to be routed via nominated proxies to the nominated server
  DNS use is limited to stateless devices
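How Contact and Record-Route pin a dialog to nominated proxies and a nominated server can be sketched from the caller's side. Under RFC 3261, the caller takes the Record-Route values from the response in reverse order as its route set and uses the Contact URI as the remote target. The function name and the addresses are hypothetical, and header parsing is omitted.

```python
def uac_dialog_routing(record_route: list, contact: str) -> dict:
    """Derive the caller's in-dialog routing state from a 2xx response.

    record_route: Record-Route values in the order they appear in the response.
    contact: URI from the Contact header of the response.
    """
    return {
        # Reversed, so the proxy nearest the caller is visited first.
        "route_set": list(reversed(record_route)),
        # In-dialog requests are addressed to the server that answered.
        "remote_target": contact,
    }


# The INVITE crossed p1 then p2; each proxy put its own URI at the top,
# so the response carries p2 before p1.
state = uac_dialog_routing(
    ["<sip:p2.example.com;lr>", "<sip:p1.example.com;lr>"],
    "sip:bob@192.0.2.4",
)
print(state["route_set"])      # p1 first, then p2
print(state["remote_target"])  # the nominated server
```

Because this state lives in the endpoints, every later request in the dialog follows the same path without the proxies having to keep call state or a DNS lookup having to return the same answer twice.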


Example: External load balancer

Advantages
  Single external IP address
  Can also provide security services at network border (SBC)
  Can hide internal topology

Potential Pitfalls
  Bottleneck
  Single point of failure

IP load balancer
  Simple => cheap, fast
  Limited value, as it breaks non-trivial flows, e.g. call transfer

SIP load balancer
  Pure SIP proxy provides limited security
  SBC function can break more complex flows
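A SIP-aware load balancer can keep every message of a call on one server by choosing the backend from a SIP-level identifier rather than from IP addresses. The sketch below hashes the Call-ID; this is one possible policy chosen for illustration, and the function name and backend names are hypothetical.

```python
import hashlib


def pick_backend(call_id: str, backends: list) -> str:
    """Map a Call-ID to a backend so that repeated lookups agree."""
    digest = hashlib.sha256(call_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["server-a", "server-b", "server-c"]
for call_id in ("a84b4c76e66710@192.0.2.1", "f81d4fae7dec11d0@192.0.2.7"):
    print(call_id, "->", pick_backend(call_id, backends))
```

The mapping is stable only while the backend list is unchanged, and flows that involve more than one Call-ID would need extra correlation, so a simple policy like this shares some of the limits the slide lists.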


Example: Multi-card chassis

Advantages
  Single external IP address (optional)
  Supports additional cards without changes to external configuration
  Removes bottleneck

Distributor intelligently routes initial requests
  Distributor not on path of subsequent requests
  IP router or NAT distributes subsequent messages

Distributor does not need to be full SIP proxy
  If SIP software has modular, distributable architecture

[Diagram labels: IP Router or NAT; Distributor(s); Path of initial message; Path of subsequent messages]
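The distributor's role can be sketched as a component that sees only initial requests and picks a card for each. The least-loaded policy, the load reports from the cards and all names below are assumptions made for illustration; the slide does not specify how the distributor chooses.

```python
class Distributor:
    """Choose a card for each initial request; later messages bypass it."""

    def __init__(self, cards):
        self.load = {card: 0 for card in cards}

    def report_load(self, card: str, active_calls: int) -> None:
        """Cards report their own load, since the distributor is off-path."""
        self.load[card] = active_calls

    def route_initial_request(self) -> str:
        """Pick the least-loaded card; ties go to the first card listed."""
        card = min(self.load, key=self.load.get)
        self.load[card] += 1  # provisional count until the next report
        return card


distributor = Distributor(["card-1", "card-2"])
print(distributor.route_initial_request())
print(distributor.route_initial_request())
```

Adding a card means adding one entry to the distributor's table, which matches the slide's point that capacity can grow without changes to external configuration.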


Designing for High Availability

Remote failures
  Detecting service availability
  Handling remote failures

Local failures
  Scope of effect
  Designing appropriate availability


Detecting Service Availability

[Diagram: Endpoint, Router and Proxy, with layers Network, TCP/IP, SIP and Application]

IP reachability: PING / ICMP
TCP: keep-alive
SIP transport: response (100 Trying)
SIP service: request
  May be implemented in several layers
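A check at the SIP service level means sending a real request and inspecting the answer. The sketch below builds an OPTIONS request and classifies a response; the choice of OPTIONS, the function names and the addresses are assumptions made for illustration, and sending the message over the network is left out.

```python
import uuid


def build_options_probe(target_host: str, local_host: str) -> str:
    """Build a minimal OPTIONS request to use as a service-level probe."""
    lines = [
        f"OPTIONS sip:{target_host} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host};branch=z9hG4bK{uuid.uuid4().hex}",
        "Max-Forwards: 70",
        f"From: <sip:probe@{local_host}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{target_host}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_host}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


def service_available(response: str) -> bool:
    """Treat a 2xx status line as evidence that the SIP service is up."""
    parts = response.split("\r\n", 1)[0].split(" ", 2)
    return len(parts) >= 2 and parts[0] == "SIP/2.0" and parts[1].startswith("2")


print(build_options_probe("proxy.example.com", "192.0.2.10"))
print(service_available("SIP/2.0 200 OK\r\n\r\n"))
print(service_available("SIP/2.0 503 Service Unavailable\r\n\r\n"))
```

A probe like this exercises more of the stack than PING or a TCP keep-alive, so it can detect a server that is reachable but not processing SIP.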


Handling Remote Failures

Cannot determine failure scope from error responses
  Intelligence is distributed => cannot relate errors to topology

No mechanism to reduce load
  SIP specification causes cascaded failure on overload
  Work-in-progress: draft-ietf-sipping-overload-reqs-00
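The cascade can be pictured from the client's side. RFC 3261 tells a client that receives a 503 to try an alternate server and to avoid the failed one for the Retry-After period, so an overloaded server's traffic moves to its peers. The class and server names below are hypothetical, and time is passed in explicitly to keep the sketch deterministic.

```python
class ServerSelector:
    """Pick the first server that is not currently marked as overloaded."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.retry_at = {}  # server -> time before which it is skipped

    def mark_overloaded(self, server: str, retry_after_s: float, now: float) -> None:
        """Record a 503 response carrying a Retry-After value."""
        self.retry_at[server] = now + retry_after_s

    def choose(self, now: float):
        for server in self.servers:
            if self.retry_at.get(server, 0.0) <= now:
                return server
        return None  # every known server is backing off


selector = ServerSelector(["server-a", "server-b"])
selector.mark_overloaded("server-a", retry_after_s=30, now=0.0)
print(selector.choose(now=10.0))  # server-a is skipped during its back-off
```

If every client behaves this way, the second server inherits the full load of the first at the moment the first fails, which is the cascade the slide warns about.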


Handling Local Failures

                 Service outage   Alternate server
Existing calls   ✗                ✗
New calls        ✗                ✓


Handling Local Failures

Hot standby
  State replication enables failover without loss of stable calls
  New calls may use alternative server during failover
  In-service upgrade/downgrade for continuous operation during maintenance

[Diagram label: State replication]

                 Service outage   Alternate server   Hot standby
Existing calls   ✗                ✗                  ✓
New calls        ✗                ✓                  ✓
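State replication for hot standby can be sketched as a primary that copies call state to a backup whenever a call becomes stable. The class, the state names and the rule that only established calls are replicated are assumptions made for illustration.

```python
class CallServer:
    """Hold call state and, if a backup is attached, replicate stable calls."""

    def __init__(self, backup=None):
        self.calls = {}  # call_id -> state
        self.backup = backup

    def update_call(self, call_id: str, state: str) -> None:
        self.calls[call_id] = state
        # Assumption: only stable (established) calls are worth replicating.
        if self.backup is not None and state == "established":
            self.backup.calls[call_id] = state

    def end_call(self, call_id: str) -> None:
        self.calls.pop(call_id, None)
        if self.backup is not None:
            self.backup.calls.pop(call_id, None)


backup = CallServer()
primary = CallServer(backup=backup)
primary.update_call("call-1", "ringing")
primary.update_call("call-2", "established")
print(backup.calls)  # only the established call is known to the backup
```

If the primary fails at this point, the backup can continue call-2, while call-1 is still being set up and would be lost, which is the sense of "without loss of stable calls".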


Implementing Hot Standby

[Diagram: a Primary Line Card and a Backup Line Card. Each card shows a Management Component, a DCL System Manager, DCL Component A, DCL Component B and the Customer's HW Manager, split between a Management Plane and a Data/Protocol Plane. Between the cards: keep alive et al; state and/or configuration replication. Legend: active connections, inactive connections.]

System Manager
  Creates backup process if required
  Initiates replication procedures
  Handles failovers
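The failover part of the System Manager's job can be sketched as a keep-alive monitor. This is a hypothetical illustration, not DCL's implementation: the class name, the timeout value and the absence of automatic failback are all assumptions, and time is passed in explicitly so the logic is easy to follow.

```python
class FailoverMonitor:
    """Promote the backup when keep-alives from the primary stop arriving."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.active = "primary"
        self.last_keepalive = 0.0

    def keepalive_received(self, now: float) -> None:
        self.last_keepalive = now

    def check(self, now: float) -> str:
        """Return the card that should be active at time `now`."""
        silent_for = now - self.last_keepalive
        if self.active == "primary" and silent_for > self.timeout_s:
            self.active = "backup"  # failover; no automatic failback
        return self.active


monitor = FailoverMonitor(timeout_s=3.0)
monitor.keepalive_received(now=10.0)
print(monitor.check(now=12.0))  # within the timeout
print(monitor.check(now=13.5))  # keep-alives have stopped for too long
```

The timeout trades detection speed against the risk of a false failover, so in practice it would be tuned to the keep-alive interval and the expected jitter between cards.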


Real-World Example

Signaling processing centralized
  Economies of scale by reducing number of centers

Media processing close to customer
  Shorter media path reduces latency
  Distribution reduces impact of single failure

[Diagram labels: Call signalling; Media processing; Local redundancy; Local distribution; Geographic distribution; Geographic redundancy]


Conclusions

SIP provides a very powerful architecture
  Many different uses and variants

Different solutions for different applications
  Different markets and device characteristics
  Good support and diagnostics are key to success

High availability and scalability are a challenge
  Inherently complex area
  Cost vs. benefit trade-off

Come and talk to us about your requirements
  Our software is designed for scalability and High Availability
  Deployed and field-hardened around the world

Meet us at booth 211

