

Design and implementation

Main features
- Socket API
  - No need to modify existing applications/middleware
- Overlay network
  - FW/NAT traversal (any node pair can connect to each other)
  - High scalability (all node pairs can connect to each other)
- High performance with minimal configuration
  - Optimizations based on latency measurements
  - Only necessary configuration is one endpoint

Bootserv
- The means by which libssocks and ssockds learn of each other (libssocks and ssockds must be configured with the endpoint of bootserv)

Libssock
- Library component of SSOCK
- From the application’s point of view, calling libssock’s connect() establishes a direct connection

Ssockd
- Daemon component of SSOCK
- Connects to other ssockds to construct an overlay (uses the overlay construction algorithm of MC-MPI (Saito et al. ’07))
- Forwards data (libssock -> ssockd, ssockd -> ssockd)
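The roles described above (a registry through which daemons find each other, an overlay built from latency measurements, and hop-by-hop forwarding) can be illustrated with a toy in-memory sketch. This is not the SSOCK implementation. The poster states that ssockd uses the overlay construction algorithm of MC-MPI, which is not reproduced here; the "link to the k lowest-latency peers" rule, the lowest-latency routing, all class and function names, and the latency figures are hypothetical.

```python
import heapq


class Bootserv:
    """Toy stand-in for bootserv: a registry through which ssockds learn of each other."""

    def __init__(self):
        self.ssockds = []

    def register(self, name):
        peers = list(self.ssockds)   # peers that joined before this one
        self.ssockds.append(name)
        return peers


def build_overlay(members, latency_ms, k):
    """Link every ssockd to its k lowest-latency peers (assumed rule, not MC-MPI's)."""
    links = {node: set() for node in members}
    for node in members:
        peers = latency_ms[node]
        for peer in sorted(peers, key=peers.get)[:k]:
            links[node].add(peer)
            links[peer].add(node)    # a real connection carries data both ways
    return links


def forwarding_path(links, latency_ms, src, dst):
    """Dijkstra over the overlay: the ssockd hops with the lowest total latency."""
    best = {src: 0}
    previous = {src: None}
    heap = [(0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1], cost
        if cost > best[node]:
            continue                 # stale heap entry
        for neighbour in links[node]:
            new_cost = cost + latency_ms[node][neighbour]
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(heap, (new_cost, neighbour))
    return None, None                # overlay is partitioned


# Hypothetical round-trip latencies between four ssockds, in milliseconds.
LATENCY_MS = {
    "a": {"b": 1, "c": 10, "d": 30},
    "b": {"a": 1, "c": 9, "d": 29},
    "c": {"a": 10, "b": 9, "d": 2},
    "d": {"a": 30, "b": 29, "c": 2},
}

bootserv = Bootserv()
for name in LATENCY_MS:
    bootserv.register(name)

overlay = build_overlay(bootserv.ssockds, LATENCY_MS, k=2)
path, cost = forwarding_path(overlay, LATENCY_MS, "a", "d")
print(path, cost)   # ['a', 'c', 'd'] 12
```

In this toy topology no real connection exists between "a" and "d", yet a virtual connection between them can be carried over two overlay hops, which is the effect the poster attributes to ssockd forwarding.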

A Scalable High-performance Communication Library for Wide-area Environments

Hideo Saito, Ken Hironaka, Kenjiro Taura
The University of Tokyo

{h_saito, kenny, tau}@logos.ic.i.u-tokyo.ac.jp

Overview

Scalable Sockets (SSOCK)
- A communication library for parallel computation
- Solves the connectivity issues involved with WANs
- Achieves high scalability and performance

Related work
- SOCKS (Leech et al. ’96)
- UDP hole punching (Rosenberg et al. ’03)
- TCP splicing (Denis et al. ’04)
- IPOP (Ganguly et al. ’06)
- SmartSockets (Maassen et al. ’07)
- DHTs

Background
- Increase in the bandwidth of WANs
  - SINET3 (JP), SURFnet (NL), etc.
- Realistic to perform parallel computation using multiple clusters connected via a WAN
  - InTrigger (JP), DAS-3 (NL), Grid5000 (FR), etc.
- Need for a “smart” communication library that automatically adapts to the environment and offers:
  - Connectivity
    - Deal with FWs and NATs
  - Scalability
    - Need to limit the number of wide-area connections in order to avoid resource allocation limits
  - Communication performance
    - “Establishing connections whenever possible” can result in poor communication performance
    - Meanwhile, mindlessly limiting the number of connections and forwarding messages does not work either
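To make the scalability concern concrete, the following back-of-the-envelope sketch counts connections at the scale reported in the experiments (1,264 processes in 11 clusters). It rests on two assumptions that are not taken from the poster: a full mesh holds one connection per process pair, and the relayed layout gives each process a single connection to one daemon per cluster, with the daemons fully meshed. The second layout is only an upper-bound illustration, not a description of SSOCK's actual topology.

```python
def full_mesh(processes):
    """Every process pair holds its own connection."""
    per_process = processes - 1
    total = processes * (processes - 1) // 2
    return per_process, total


def relayed(processes, clusters):
    """Assumed layout: one relay daemon per cluster, daemons fully meshed."""
    per_process = 1                               # only the local daemon
    wide_area = clusters * (clusters - 1) // 2    # upper bound on daemon-to-daemon links
    return per_process, wide_area


mesh_per_process, mesh_total = full_mesh(1264)
relay_per_process, relay_wide_area = relayed(1264, 11)
print(mesh_per_process, mesh_total)        # 1263 798216
print(relay_per_process, relay_wide_area)  # 1 55
```

Under these assumptions a full mesh needs 1,263 descriptors in every process, which would exceed a per-process soft limit of 1,024 (a common Linux default, assumed here rather than known for the test machines).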

Experimental results

Experimental setup
- 1,264 cores in 11 clusters

Connectivity and scalability
- Brought up a process on each of the 1,264 cores
- Socket library
  - Connections could not be established between the two clusters behind NAT gateways
  - Reached the limit on the number of file descriptors that each process could use
  - Reached the limit on the number of connections that could be handled by a single NAT gateway
- SSOCK (w/ 1 ssockd per cluster)
  - Connections could be established between all 1,264 processes

Simultaneous connects
- Brought up the same number of processes in each cluster
- Every process tried to connect to every other process simultaneously using non-blocking connects
- The Socket library performed poorly because many SYN packets were dropped

Point-to-point performance
- Ping-pong tests compare the Socket library and SSOCK within a cluster and between clusters
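For readers unfamiliar with the technique used in the simultaneous-connects experiment, here is a minimal sketch of issuing many non-blocking connects at once with the standard socket API on a POSIX system. It runs against a local listener on the loopback interface, so it shows the mechanism only, not the wide-area behaviour. The function name `connect_all` and the counts are illustrative, and this is not the benchmark code behind the poster's results.

```python
import errno
import select
import socket


def connect_all(address, count, timeout=5.0):
    """Start `count` non-blocking connects, then wait for each to finish."""
    pending = []
    for _ in range(count):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setblocking(False)
        err = sock.connect_ex(address)            # returns without waiting for the handshake
        if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
            sock.close()
            raise OSError(err, "connect failed")
        pending.append(sock)

    connected = []
    while pending:
        # A socket becomes writable once its connect has completed or failed.
        _, writable, _ = select.select([], pending, [], timeout)
        if not writable:
            break                                 # timed out, e.g. because SYNs were dropped
        for sock in writable:
            pending.remove(sock)
            if sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0:
                connected.append(sock)
            else:
                sock.close()
    for sock in pending:
        sock.close()
    return connected


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))                   # port 0: let the OS pick a free port
listener.listen(64)

clients = connect_all(listener.getsockname(), count=16)
print(len(clients), "connections established")
for sock in clients:
    sock.close()
listener.close()
```

When every process does this towards every other process at the same moment, the listeners and gateways see a burst of SYN packets; connects whose SYNs are dropped only complete after TCP retransmits them, which matches the poster's explanation of the poor Socket library result.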

Future work
- A study on the effects of bringing up multiple ssockds per LAN

[Figure: SSOCK architecture diagram. Labels: libssock; ssockd (≥1 per LAN); bootserv (one per system); overlay network; real connection; virtual connection]

[Plot: Bandwidth [Mbps] (0 to 1000) vs. Message Size [MB] (0 to 3); series: Socket, SSOCK]
[Plot: Bandwidth [Mbps] (0 to 80) vs. Message Size [MB] (0 to 20); series: Socket, SSOCK]
[Plot: Completion Time [s] (0 to 50) vs. Number of Processes (0 to 112); series: Socket, SSOCK]

Cluster   Network     Cluster   Network
Chiba     Global      Kototoi   Global
Hiro      Global      Kyoto     NAT
Hongo     Global      Kyushu    Global
Imade     NAT         Mirai     Global
Istbs     Global      Okubo     Global
Keio      Global      Suzuk     Global
Kobe      Firewall

Intra-cluster (kototoi) ping-pong performance

Inter-cluster (hongo-okubo) ping-pong performance
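A ping-pong test of the kind named in these captions bounces a message between two endpoints and derives bandwidth from the round-trip time. The sketch below does this over a local `socket.socketpair()` with an echo thread, so the figure it prints says nothing about a real network. The helper names, the message size, and the number of rounds are hypothetical; the poster does not give its benchmark code or parameters.

```python
import socket
import threading
import time


def recv_exact(sock, size):
    """Read exactly `size` bytes, since recv() may return fewer."""
    chunks = []
    remaining = size
    while remaining:
        chunk = sock.recv(min(remaining, 1 << 16))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def echo(sock, size, rounds):
    """The 'pong' side: send every message straight back."""
    for _ in range(rounds):
        sock.sendall(recv_exact(sock, size))


def ping_pong_mbps(size, rounds=10):
    """Return bandwidth in Mbps over `rounds` round trips of `size` bytes."""
    ping, pong = socket.socketpair()
    worker = threading.Thread(target=echo, args=(pong, size, rounds))
    worker.start()
    payload = b"x" * size
    start = time.perf_counter()
    for _ in range(rounds):
        ping.sendall(payload)
        reply = recv_exact(ping, size)
    elapsed = time.perf_counter() - start
    worker.join()
    ping.close()
    pong.close()
    if reply != payload:
        raise ValueError("echoed message does not match")
    # Each round trip moves the message twice (there and back).
    return 2 * size * rounds * 8 / elapsed / 1e6


mbps = ping_pong_mbps(1 << 20)   # 1 MiB messages
print(f"{mbps:.0f} Mbps")
```

Sweeping `size` over a range of message sizes and plotting the result would give a curve with the same axes as the poster's bandwidth plots.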