Design and implementation

Main features
  Socket API
    No need to modify existing applications/middleware
  Overlay network
    FW/NAT traversal (any node pair can connect to each other)
    High scalability (all node pairs can connect to each other)
  High performance with minimal configuration
    Optimizations based on latency measurements
    The only necessary configuration is the endpoint of bootserv

Bootserv
  The means by which libssocks and ssockds learn of each other (libssocks and ssockds must be configured with the endpoint of bootserv)

Libssock
  Library component of SSOCK
  From the application’s point of view, calling libssock’s connect() establishes a direct connection

Ssockd
  Daemon component of SSOCK
  Connects to other ssockds to construct an overlay (uses the overlay construction algorithm of MC-MPI (Saito et al. ’07))
  Forwards data (libssock -> ssockd, ssockd -> ssockd)
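The forwarding path described above can be illustrated with a minimal sketch. It is not SSOCK code: the function names, the process labels, and the overlay links are hypothetical, and a plain breadth-first search stands in for the MC-MPI overlay construction and for whatever routing SSOCK actually uses. The sketch also assumes that the last ssockd hands the data to the destination libssock.

```python
from collections import deque


def overlay_path(links, src, dst):
    """Return the fewest-hop list of ssockd names from src to dst, or None.

    links: iterable of (a, b) pairs, each an established ssockd-to-ssockd
    connection, assumed usable in both directions once it is open.
    """
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    previous = {src: None}          # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


def virtual_connection(links, src_proc, src_lan, dst_proc, dst_lan):
    """List every hop that data takes for one virtual connection."""
    daemons = overlay_path(links, src_lan, dst_lan)
    if daemons is None:
        return None
    return ([f"libssock:{src_proc}"]
            + [f"ssockd:{lan}" for lan in daemons]
            + [f"libssock:{dst_proc}"])


# Hypothetical overlay: two NAT'd clusters each dial out to a global one.
LINKS = [("imade", "hongo"), ("kyoto", "hongo")]
HOPS = virtual_connection(LINKS, "p0", "imade", "p1", "kyoto")
```

Here the two processes sit behind different NAT gateways and cannot connect directly, yet the virtual connection is completed through three ssockds, while the application only sees the result of its connect() call.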
A Scalable High-performance Communication Library for Wide-area Environments
Hideo Saito, Ken Hironaka, Kenjiro Taura
The University of Tokyo
{h_saito, kenny, tau}@logos.ic.i.u-tokyo.ac.jp
Overview

Scalable Sockets (SSOCK)
  A communication library for parallel computation
  Solves the connectivity issues involved with WANs
  Achieves high scalability and performance
Related work

SOCKS (Leech et al. ’96)
UDP hole punching (Rosenberg et al. ’03)
TCP splicing (Denis et al. ’04)
IPOP (Ganguly et al. ’06)
SmartSockets (Maassen et al. ’07)
DHTs

Background

Increase in the bandwidth of WANs
  SINET3 (JP), SURFnet (NL), etc.
Realistic to perform parallel computation using multiple clusters connected via a WAN
  InTrigger (JP), DAS-3 (NL), Grid5000 (FR), etc.
Need for a “smart” communication library that automatically adapts to the environment and offers:
  Connectivity
    Deal with FWs and NATs
  Scalability
    Need to limit the number of wide-area connections in order to avoid resource allocation limits
  Communication performance
    “Establishing connections whenever possible” can result in poor communication performance
    Meanwhile, mindlessly limiting the number of connections and forwarding messages does not work either
Experimental results

Experimental setup
  1,264 cores in 11 clusters
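The scale of this setup can be made concrete with a short calculation. The sketch below assumes one process per core, one connection per process pair for the plain Socket library, and one ssockd per cluster. The full mesh among ssockds is only an upper bound; the overlay that SSOCK actually builds may use fewer links.

```python
def full_mesh_connections(n):
    """Number of connections needed to link every pair among n endpoints."""
    return n * (n - 1) // 2


PROCESSES = 1264   # one process per core, as in the setup above
CLUSTERS = 11      # assumption: one ssockd per cluster

# Plain sockets: every process holds one socket per peer.
SOCKETS_PER_PROCESS = PROCESSES - 1
ALL_PAIR_CONNECTIONS = full_mesh_connections(PROCESSES)

# Upper bound on ssockd-to-ssockd links if every daemon pair were connected.
SSOCKD_LINKS_UPPER_BOUND = full_mesh_connections(CLUSTERS)
```

Under these assumptions each process needs 1,263 sockets and the system needs 798,216 connections in total (intra-cluster pairs included), against at most 55 links between ssockds.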
Connectivity and scalability
  Brought up a process on each of the 1,264 cores
  Socket library
    Connections could not be established between the two clusters behind NAT gateways
    Reached the limit on the number of file descriptors that each process could use
    Reached the limit on the number of connections that could be handled by a single NAT gateway
  SSOCK (w/ 1 ssockd per cluster)
    Connections could be established between all 1,264 processes

Simultaneous connects
  Brought up the same number of processes in each cluster
  Every process tried to connect to every other process simultaneously using non-blocking connects
  The Socket library performed poorly because many SYN packets were dropped

Point-to-point performance
Future work

A study on the effects of bringing up multiple ssockds per LAN
[Figure: SSOCK architecture. Labels: Libssock, Ssockd (≥1 per LAN), Bootserv (one per system), overlay network, virtual connection, real connection]
[Plot: bandwidth [Mbps] (axis 0 to 1000) vs. message size [MB] (axis 0 to 3); series: Socket, SSOCK]
[Plot: bandwidth [Mbps] (axis 0 to 80) vs. message size [MB] (axis 0 to 20); series: Socket, SSOCK]
[Plot: completion time [s] (axis 0 to 50) vs. number of processes (axis 0 to 112); series: Socket, SSOCK]
Cluster   Network    Cluster   Network
Chiba     Global     Kototoi   Global
Hiro      Global     Kyoto     NAT
Hongo     Global     Kyushu    Global
Imade     NAT        Mirai     Global
Istbs     Global     Okubo     Global
Keio      Global     Suzuk     Global
Kobe      Firewall
Intra-cluster (kototoi) ping-pong performance
Inter-cluster (hongo-okubo) ping-pong performance