OpenVoIP An Open Peer-to-Peer VoIP and IM System


Page 1: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

OpenVoIP An Open Peer-to-Peer VoIP and IM System

Salman Abdul Baset, Gaurav Gupta, and Henning Schulzrinne

Columbia University

Page 2: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Agenda

• What is a peer-to-peer VoIP and IM system?
• Why P2P?
• Why not Skype or OpenDHT?
• Design challenges
• OpenVoIP architecture and design
• Implementation issues
• Demo
• Relay selection in P2P VoIP system
• Performance monitoring of a P2P VoIP system

Page 3: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

A Peer-to-Peer VoIP and IM System

[Diagram: functions of a VoIP and IM system]
• Directory service
• Establish media session in the presence of NATs
• Presence
• PSTN connectivity (PSTN / Mobile)
• Monitoring

P2P for all of these?

Page 4: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Why P2P?
• Cost
• Scale
  – 10 million Skype online users (comScore)
  – 23 million MSN online users (comScore)
• Media session load
  – 100,000 calls per minute (1,666 calls per second)
  – 106 Mb/s (64 kb/s voice), 426 Mb/s (256 kb/s video)
• Presence load
  – 1000 notifications per second (500 B per notification)
  – 4 Mb/s
• Monitoring load
  – Call minutes
  – Number of online users
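The load figures on this slide can be reproduced with simple arithmetic. The sketch below assumes, as the slide's numbers imply, that each of the 1,666 calls per second is counted as one stream at the stated codec rate; the function name is illustrative.

```python
# Back-of-the-envelope check of the media and presence load figures.
CALLS_PER_MINUTE = 100_000
calls_per_second = CALLS_PER_MINUTE // 60            # 1666


def aggregate_mbps(streams: int, kbps_per_stream: float) -> float:
    """Aggregate bit rate in Mb/s for `streams` flows of `kbps_per_stream` kb/s."""
    return streams * kbps_per_stream / 1000


voice_mbps = aggregate_mbps(calls_per_second, 64)    # about 106 Mb/s
video_mbps = aggregate_mbps(calls_per_second, 256)   # about 426 Mb/s

# Presence: 1000 notifications/s, 500 bytes each, 8 bits per byte.
presence_mbps = 1000 * 500 * 8 / 1_000_000           # 4 Mb/s

print(int(voice_mbps), int(video_mbps), presence_mbps)
```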

Page 5: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Why not Skype?
• Median call latency through a relay: 96 ms (~6K calls)
  – Two machines behind NAT in our lab (ping < 1 ms)
• Call success rate
  – 7.3% when host cache deleted, call peers behind NAT
    • 4.5K call attempts
  – 74% when traffic blocked between call peers
    • 11K call attempts
• User annoyance
  – relays calls through a machine whose user needs the bandwidth!
  – user shuts down the application, resulting in a call drop
• Closed and proprietary solution
  – plug P2P into existing SIP phones

Page 6: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Why not OpenDHT?

• Actively maintained?
  – 22 nodes as of Sep 7, 2008 [1]
• NAT traversal
• Non-OpenDHT nodes cannot fully participate in the overlay

[1] http://opendht.org/servers.txt

Page 7: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Design Challenges

the usual list…
#1 Scalability
#2 Reliability
#3 Robustness
#4 Bootstrap
#5 NAT traversal
#6 Security
  – data, storage, routing (hard)
#7 Management (monitoring)
#8 Debugging

at bounded bw, cpu, mem per node (< 500 B/s)
a must for any commercial p2p network

Page 8: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Design Challenges

the not so usual list…
#1 Scalability, but how?
  – Planet Lab has ~500 machines online
    • ~400 in August
  – beyond Planet Lab
  – which DHT, or unstructured? any?
#2 Robustness?
  – a realistic churn model?
    • at best Skype, p2p traces
#3 Maintenance?
  – OpenDHT only running on 22 nodes (Sep 7, 2008 [1])
#4 NAT traversal
  – Nodes behind NAT fully participating in the overlay
    • Maybe, but at what cost?

[1] http://opendht.org/servers.txt

Page 9: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

OpenVoIP
• Design goals
  – meet the challenges
  – distributed directory service
    • Chord, Kademlia, Pastry, Gia
  – protocol vs. algorithm
    • common protocol / encoding mechanisms
  – establish media session between peers [behind NAT]
    • STUN / TURN / ICE
  – use of peers as relays
  – distributed monitoring / statistics gathering
• Implementation goals
  – multiplatform
  – pluggable with open source SIP phones
  – ease of debugging
• Performance goals
  – relay selection and performance monitoring mechanisms
  – beat Skype!

Page 10: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

OpenVoIP architecture

[Architecture diagram. Labels: a peer in P2PSIP; a client; NAT; Overlay1; Overlay2; bootstrap / authentication; monitoring server / Google Maps; user addresses at example.com]

Protocol stack of a peer: SIP; P2P, STUN; TLS / SSL

Page 11: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Peer-to-Peer Protocol (P2PP)

• A binary protocol
• Geared towards IP telephony but equally applicable to file sharing, streaming, and p2p-VoD
• Multiple DHT and unstructured p2p protocol support
• Application API
• NAT traversal
  – using STUN, TURN and ICE
• Request routing
  – recursive, iterative, parallel
  – per message
• Supports hierarchy (super nodes [peers], ordinary nodes [clients])
• Central entities (e.g., authentication server)

Page 12: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Peer-to-Peer Protocol (P2PP)

• Reliable or unreliable transport (TCP/TLS or UDP/DTLS)
• Security
  – DTLS, TLS, storage security
• Multiple hash function support
  – SHA1, SHA256, MD4, MD5
• Monitoring
  – ewma_bytes_sent [rcvd], CPU utilization, routing table
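The monitoring counter ewma_bytes_sent suggests an exponentially weighted moving average of per-interval byte counts. The sketch below shows the standard EWMA update; the class name, the smoothing factor, and the sampling interval are assumptions, not values taken from P2PP.

```python
from typing import Optional


class Ewma:
    """Exponentially weighted moving average of a sampled quantity."""

    def __init__(self, alpha: float = 0.125) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha                      # weight of the newest sample (assumed)
        self.value: Optional[float] = None      # no estimate before the first sample

    def update(self, sample: float) -> float:
        """Fold one new sample into the average and return the new estimate."""
        if self.value is None:
            self.value = float(sample)
        else:
            self.value = self.alpha * sample + (1 - self.alpha) * self.value
        return self.value


# Bytes sent in three consecutive measurement intervals (made-up samples).
ewma_bytes_sent = Ewma(alpha=0.5)
for bytes_in_interval in (400, 600, 200):
    ewma_bytes_sent.update(bytes_in_interval)
```

A larger alpha tracks bursts more quickly; a smaller one gives a smoother figure for a monitoring display.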

Page 13: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

OpenVoIP features

• Kademlia, Bamboo, Chord
• SHA1, SHA256, MD5, MD4
• Hash base: multiple of 2
• Recursive and iterative routing
• Windows XP / Vista, Linux
• Integrated with OpenWengo
• Can connect to OpenWengo and P2PP network
• Buddy lists and IM
• 1000 node Planet Lab network on ~300 machines
• Integrated with Google Maps

Demo video: http://youtube.com/?v=g-3_p3sp2MY

Page 14: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

OpenVoIP snapshots

[Screenshots: call through a relay; call through a NAT; direct call]

Page 15: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

OpenVoIP snapshots

• Google Map interface

Page 16: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

OpenVoIP snapshots

• Tracing lookup request on Google Maps

Page 17: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

OpenVoIP snapshots

Page 18: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

OpenVoIP snapshots

• Resource consumption of a node

Page 19: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Why may calls fail in OpenVoIP?

• Cannot find a user
  – user is online, but p2p cannot find it
• NAT and firewall issues
  – SIP messages: call succeeds, but media?
  – relay
• Relay is shut down

System reliability: search + NAT traversal + relay

Page 20: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Facts of Peer-to-Peer Life

• Routing loops happen
• Byzantine failures arise
• Nodes become disconnected
• System does not always scale!
• Automated maintenance does not always work
• Planet Lab quirks
  – cleans the directory
  – DoS attacks on open ports
• Bootstrap server is attacked

Page 21: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

OpenVoIP: Key techniques

• Randomization is our best friend!
  – send the maintenance messages within a bounded random time
• Churn recovery
  – is on demand and periodic
• Insert a new entry in routing table after checking liveness
• Periodically republish SIP records
  – not feasible for large records
• Avoid overly complex mechanisms: they can backfire!
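Sending maintenance messages "within a bounded random time" is commonly done by adding jitter to a base period so that nodes do not synchronize their keep-alives. A minimal sketch, assuming a 30 s base period and up to 10 s of jitter (both values are illustrative, not OpenVoIP's):

```python
import random


def next_maintenance_delay(base_s: float, max_jitter_s: float,
                           rng: random.Random) -> float:
    """Delay until the next maintenance message: base period plus bounded jitter."""
    return base_s + rng.uniform(0.0, max_jitter_s)


rng = random.Random(42)   # seeded only to make the example repeatable
delays = [next_maintenance_delay(30.0, 10.0, rng) for _ in range(5)]
print([round(d, 1) for d in delays])
```

Every delay stays within [30 s, 40 s], so maintenance traffic remains bounded while being spread out in time.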

Page 22: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

OpenVoIP: Debugging

• Black-box
  – Lookup request for a random key
• State acquisition
  – Remotely obtain the resource and storage utilization of a node
• Set and unset a data-value on a node
  – such as BW, CPU utilization
  – to test a relay selection algorithm
• Remotely enable and disable logging
• Control log size
• Find a faulty node
  – hard
  – centralized vs. distributed approach

Page 23: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

OpenVoIP – releasing an update

Three-step process
1) Check in a local network (10-15 nodes)
2) Deploy the update on a managed node that fully participates in the overlay
  – test its functionality
3) Release the update

• Planet Lab deployment
  – churn one quarter of the network
  – deploy the update
  – continue until done
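The Planet Lab deployment loop (take down a quarter of the network, update it, repeat) amounts to a rolling update in batches. A rough sketch of the batching only; the node names and the helper function are hypothetical, and the actual restart and deployment commands are not shown on the slide:

```python
import math


def update_batches(nodes: list[str], fraction: float = 0.25) -> list[list[str]]:
    """Split nodes into consecutive batches, each about `fraction` of the network."""
    size = max(1, math.ceil(len(nodes) * fraction))
    return [nodes[i:i + size] for i in range(0, len(nodes), size)]


nodes = [f"node{i}" for i in range(10)]
batches = update_batches(nodes)
for batch in batches:
    # In a real deployment: stop these nodes, install the update, restart them,
    # and wait for the overlay to stabilize before moving to the next batch.
    print(batch)
```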

Page 24: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

OpenVoIP: Bootstrap

• Returns a list of twenty nodes if available
• Recently joined nodes and some managed nodes
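One way a bootstrap server could build such a reply is to remember the most recent joins and pad the list with a few managed nodes. The sketch below is an assumption about the mechanics; only the limit of twenty and the mix of node types come from the slide, and the split (at most 5 managed nodes per reply) is invented for illustration.

```python
from collections import deque


class BootstrapServer:
    """Toy bootstrap server: replies with recent joiners plus some managed nodes."""

    def __init__(self, managed_nodes, max_reply=20, managed_in_reply=5):
        self.managed = list(managed_nodes)
        self.max_reply = max_reply
        self.managed_in_reply = managed_in_reply   # assumed split, not from the slide
        self.recent = deque(maxlen=max_reply)      # newest join is at the right end

    def on_join(self, node):
        """Record a node that has just joined the overlay."""
        self.recent.append(node)

    def reply(self):
        """Return up to max_reply nodes: newest joiners first, then managed nodes."""
        managed = self.managed[:self.managed_in_reply]
        room = self.max_reply - len(managed)
        recent = [n for n in reversed(self.recent) if n not in managed][:room]
        return recent + managed


server = BootstrapServer(managed_nodes=["m1", "m2"])
for i in range(30):
    server.on_join(f"p{i}")
print(server.reply())
```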

Page 25: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Thank you.

Page 26: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

NAT traversal

[Diagram. Labels: SIP server; STUN / TURN server; SIP DB; flows: SIP, Media, P2PP]

Page 27: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

NAT traversal

• Solution space
  – Tunnel SIP and RTP within P2PP
  – Tunnel SIP within P2PP
  – NAT traversal for P2PP, SIP, RTP
    • tunnel within STUN, multiplexing
    • different ports, same port

Page 28: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Implementation issues

• Routing table maintenance
  – hash table
  – insert a new entry after a ‘keep-alive’
  – max entries per row (currently 5)
  – proximity neighbor selection [disabled]
• Churn recovery
  – send keep-alive to nodes after a random time
  – on demand
  – get routing table of randomly selected node
• Bootstrap
  – bootstrap server and 20 bootstrap peers
  – returns recently joined nodes and some bootstrap nodes

[Diagram: routing table with rows x+2^i, x+2^(i+1), x+2^(i+2), x+2^(i+3)]
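The routing table rules above (hash table, insert only after a keep-alive succeeds, at most 5 entries per row) can be sketched as follows. The class and method names are hypothetical; only the limit of 5 and the insert-after-keep-alive rule come from the slide.

```python
class RoutingTable:
    """Toy routing table: a hash table mapping row index to a bounded list of nodes."""

    def __init__(self, max_per_row: int = 5) -> None:
        self.max_per_row = max_per_row
        self.rows: dict[int, list[str]] = {}

    def on_keep_alive_ok(self, row: int, node: str) -> bool:
        """Store `node` in `row` once it has answered a keep-alive.

        Returns True if the node is in the table afterwards, False if the row is full.
        """
        entries = self.rows.setdefault(row, [])
        if node in entries:
            return True
        if len(entries) >= self.max_per_row:
            return False
        entries.append(node)
        return True

    def on_keep_alive_timeout(self, row: int, node: str) -> None:
        """Drop a node that failed to answer a keep-alive (churn recovery)."""
        entries = self.rows.get(row, [])
        if node in entries:
            entries.remove(node)


table = RoutingTable()
results = [table.on_keep_alive_ok(3, f"peer{i}") for i in range(6)]
print(results)   # the sixth insert into the same row is rejected
```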

Page 29: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Implementation design

[Layer diagram. Labels:]
• Transport / timers: UDP, TCP, DTLS, TLS
• Parser / encoder
• Transactions
• Node, BigInt, Distance, Routing table, Neighbor table
• Client, Bootstrap, KadPeer, BambooPeer, OtherPeer
• Sys
• Application interface:
  – insert (key, value, callback)
  – lookup (key, callback)
  – callback (resp)
• Brace annotations: multiplatform; app. pluggability
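The application interface on this slide is callback based: insert (key, value, callback), lookup (key, callback), callback (resp). The sketch below mimics that shape with an in-memory stand-in; the class name, the response codes passed to the callbacks (200 appears in the message flows, 404 is an assumption), and the synchronous invocation are all illustrative, since a real node would answer after network round trips.

```python
from typing import Callable, Optional


class LocalNode:
    """In-memory stand-in for an overlay node; performs no networking."""

    def __init__(self) -> None:
        self._store: dict[str, bytes] = {}

    def insert(self, key: str, value: bytes,
               callback: Callable[[int], None]) -> None:
        """Store the value and report the outcome through the callback."""
        self._store[key] = value
        callback(200)

    def lookup(self, key: str,
               callback: Callable[[int, Optional[bytes]], None]) -> None:
        """Look the key up and hand (code, value) to the callback."""
        if key in self._store:
            callback(200, self._store[key])
        else:
            callback(404, None)   # 404 is an assumed "not found" code


node = LocalNode()
events = []
node.insert("alice", b"192.0.2.10:5060", lambda code: events.append(("insert", code)))
node.lookup("alice", lambda code, value: events.append(("lookup", code, value)))
node.lookup("bob", lambda code, value: events.append(("lookup", code, value)))
print(events)
```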

Page 30: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Implementation issues

• Request routing
  – recursive
    • per message state
  – iterative
  – loop detection
    • iterative [machine]
    • recursive [using message state]
• Replication vs. republish
  – periodically republish [30 s to 1 minute]
  – [pro] learn about the topology
  – [con] republishing large data incurs bw overhead
• Logging
  – log mechanism

Page 31: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Implementation issues
• Diagnostics
  – protocol
  – command-line
    • showrt, shownt, showro, showcp
    • insert [key] [value], rlookup, ulookup
    • getrt, getnt, getro [IPaddr] [port]
  – graphical
• Platform independence
  – thread: 3 functions
    • createthread, waitforthread [pthread_join],
  – sys: 3 functions
    • strcasecmp, getopt, gettimeofday (GetSystemTimeAsFileTime)
  – net: 4 functions
    • close [closesocket], inet_aton [inet_addr], select timer, getsockopt

Page 32: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Join

[Message sequence diagram. Participants: JP, BS, P5, P7. BS = bootstrap server. Other labels: JP(P10), P9, N(P9, P15) (twice)]

1. Bootstrap
2. 200 (P5, P30, P2P-Options)
3+. STUN (ICE candidate gathering)
4. Join
5. Join
6. 200
7. 200
8. Join
9. 200
10. PublishObject
11. 200

Page 33: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Call establishment

[Message sequence diagram. Participants: P1, P3, P5, P7]

1. LookupObject (P7)
2. LookupObject (P7)
3. LookupObject (P7)
4. 200 (P7 PeerInfo)
5. 200 (P7 PeerInfo)
6. 200 (P7 PeerInfo)
7. INVITE
8. 200 Ok
9. ACK
Media

Page 34: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Chord

[Diagram: node with id=x, its neighbor table, and a routing table with entries x+2^i, x+2^(i+1), x+2^(i+2), x+2^(i+3); each entry can be any node in the interval]
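The routing-table rows x+2^i in the diagram are the interval starts of a Chord-style table, and "any node in the interval" can fill a row. A small sketch of that arithmetic on an m-bit identifier ring; the function names and the 4-bit ring are chosen only for illustration.

```python
def finger_starts(x: int, m: int) -> list[int]:
    """Start of each routing-table interval for node x on an m-bit ring."""
    return [(x + (1 << i)) % (1 << m) for i in range(m)]


def in_interval(key: int, start: int, end: int, m: int) -> bool:
    """True if key lies in [start, end) on the ring, allowing wrap-around.

    start == end is treated as an empty interval.
    """
    size = 1 << m
    return (key - start) % size < (end - start) % size


starts = finger_starts(10, 4)
print(starts)                       # rows for x=10 on a 16-position ring
print(in_interval(15, 14, 2, 4))    # 15 falls in the wrapped interval [14, 2)
```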

Page 35: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Kademlia (XOR)

[Diagram: node with id=x and a routing table with entries at XOR distances 2^i, 2^(i+1), 2^(i+2), 2^(i+3); no neighbor table]
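In Kademlia the distance between two identifiers is their bitwise XOR, and a contact belongs to row i when that distance lies in [2^i, 2^(i+1)). A minimal sketch of the row computation; the function names are illustrative.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two node identifiers."""
    return a ^ b


def bucket_index(x: int, other: int) -> int:
    """Row i such that 2^i <= distance(x, other) < 2^(i+1)."""
    d = xor_distance(x, other)
    if d == 0:
        raise ValueError("a node does not store itself in its routing table")
    return d.bit_length() - 1


print(bucket_index(0b1010, 0b1000))   # distance 0b0010, row 1
print(bucket_index(0b1010, 0b0011))   # distance 0b1001, row 3
```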

Page 36: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Chord – recursive

[Diagram: recursive lookup in Chord; node with id=x, neighbor table, and routing table entries x+2^i, x+2^(i+1), x+2^(i+2), x+2^(i+3)]

Page 37: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Chord – iterative

[Diagram: iterative lookup in Chord; node with id=x, neighbor table, and routing table entries x+2^i, x+2^(i+1), x+2^(i+2), x+2^(i+3)]
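The difference between recursive and iterative routing is who sends each hop's request. In the sketch below the overlay is reduced to a pre-wired chain P1, P3, P5, P7 (an assumption that stands in for real DHT routing), and each function records (sender, receiver) pairs for the lookup requests.

```python
class Peer:
    """Toy peer whose next hop toward the key is wired in advance."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.next_hop = None   # None means this peer is responsible for the key

    def step(self, key: str):
        """One routing step: (owner, None) if responsible, else (None, next hop)."""
        if self.next_hop is None:
            return self, None
        return None, self.next_hop


def recursive_lookup(peer: Peer, key: str, messages: list) -> Peer:
    """Each peer forwards the request itself to its next hop."""
    owner, nxt = peer.step(key)
    if owner is not None:
        return owner
    messages.append((peer.name, nxt.name))
    return recursive_lookup(nxt, key, messages)


def iterative_lookup(origin: Peer, key: str, messages: list) -> Peer:
    """The originator contacts every hop directly, learning the next hop each time."""
    peer = origin
    while True:
        owner, nxt = peer.step(key)
        if owner is not None:
            return owner
        messages.append((origin.name, nxt.name))
        peer = nxt


p1, p3, p5, p7 = (Peer(n) for n in ("P1", "P3", "P5", "P7"))
p1.next_hop, p3.next_hop, p5.next_hop = p3, p5, p7

rec_msgs, itr_msgs = [], []
recursive_lookup(p1, "P7", rec_msgs)
iterative_lookup(p1, "P7", itr_msgs)
print(rec_msgs)
print(itr_msgs)
```

In the recursive trace the sender changes at every hop; in the iterative trace the originator P1 sends every request.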

Page 38: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Relay selection

• Using peers as relays
• Peer acting as relay
  – can preallocate a fixed number of calls
    • Skype: one voice/video call per relay
  – can preallocate resources
    • CPU, bw
  – as long as the user of the relay machine is not ‘annoyed’
    • what does ‘annoyed’ mean?

Page 39: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Relay selection

• Annoyance function af()
  – threshold based: if af() < threshold, use as a relay
  – real-value
  – Input parameters
    • CPU utilization, interactivity, bytes sent/rcvd
• Relay selection approach
  – constraint: RTT, loss rate, uptime
  – select a relay set
  – load-balance approach
  – annoyance function approach
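The slide names the inputs of af() and the threshold rule but not its form. The sketch below assumes a weighted sum of normalized inputs; the weights, the normalization cap of 100,000 B/s, and the threshold of 0.5 are invented for illustration and are not the authors' definition.

```python
from dataclasses import dataclass


@dataclass
class PeerLoad:
    cpu_utilization: float   # 0..1
    interactivity: float     # 0..1, where 1 means the user is actively at the machine
    bytes_per_s: float       # bytes sent plus received per second


def af(load: PeerLoad, max_bytes_per_s: float = 100_000.0,
       weights: tuple = (0.4, 0.4, 0.2)) -> float:
    """Assumed annoyance function: weighted sum of normalized inputs, in 0..1."""
    bw = min(load.bytes_per_s / max_bytes_per_s, 1.0)
    w_cpu, w_int, w_bw = weights
    return w_cpu * load.cpu_utilization + w_int * load.interactivity + w_bw * bw


def usable_as_relay(load: PeerLoad, threshold: float = 0.5) -> bool:
    """Threshold rule from the slide: use the peer as a relay if af() < threshold."""
    return af(load) < threshold


idle = PeerLoad(cpu_utilization=0.1, interactivity=0.0, bytes_per_s=10_000)
busy = PeerLoad(cpu_utilization=0.9, interactivity=1.0, bytes_per_s=80_000)
print(usable_as_relay(idle), usable_as_relay(busy))
```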

Page 40: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Relay selection algorithm

• Routing table based
  – call load to number of relays in routing table
• AS number based
  – select a relay within same AS
  – but too many machines in one AS
  – or none …
• IP prefix based
• Random
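As an illustration of the IP prefix based and random options, the sketch below prefers a candidate relay that shares a prefix with the caller and falls back to a random candidate otherwise. The /16 prefix length, the function name, and the fallback policy are assumptions.

```python
import ipaddress
import random
from typing import Optional


def pick_relay(caller_ip: str, candidates: list[str], prefix_len: int = 16,
               rng: Optional[random.Random] = None) -> Optional[str]:
    """Prefer a relay sharing the caller's IP prefix; otherwise pick at random."""
    if not candidates:
        return None
    rng = rng or random.Random()
    # strict=False lets a host address define the enclosing network.
    net = ipaddress.ip_network(f"{caller_ip}/{prefix_len}", strict=False)
    same_prefix = [c for c in candidates if ipaddress.ip_address(c) in net]
    return rng.choice(same_prefix or candidates)


print(pick_relay("128.59.1.1", ["128.59.20.7", "10.0.0.1"]))
print(pick_relay("128.59.1.1", ["10.0.0.1"]))
```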

Page 41: OpenVoIP  An Open Peer-to-Peer VoIP and IM System

Relay selection algorithm

• Churn
  – what happens when a relay goes down?
  – active vs. passive approach
    • active: send redundant traffic through alternate relays
    • passive: detect failure and then switch
  – different relays for media traversing in each direction
    • For 18% of calls (18K total), Skype uses a different relay from caller to callee and vice versa
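The two churn-handling approaches can be contrasted in a few lines. This is a sketch under simplifying assumptions: liveness detection and packet sending are passed in as hypothetical callables, and timing is ignored.

```python
from typing import Callable, Optional


def passive_failover(relays: list[str],
                     is_alive: Callable[[str], bool]) -> Optional[str]:
    """Passive: after detecting a failure, switch to the first relay still alive."""
    for relay in relays:
        if is_alive(relay):
            return relay
    return None


def active_send(packet: bytes, relays: list[str],
                send: Callable[[str, bytes], None]) -> int:
    """Active: send the same packet through every relay; returns copies sent."""
    for relay in relays:
        send(relay, packet)
    return len(relays)


down = {"relayA"}
chosen = passive_failover(["relayA", "relayB", "relayC"], lambda r: r not in down)

sent = []
copies = active_send(b"rtp", ["relayB", "relayC"], lambda r, p: sent.append(r))
print(chosen, copies)
```

The active approach trades extra bandwidth for an uninterrupted stream; the passive approach is cheaper but leaves a gap while the failure is detected.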