Deploying 10/40G InfiniBand Applications over the WAN
Eric Dube ([email protected])
Senior Product Manager of Systems
November 2011

Overview
■ About Bay
• Founded in 2000 to provide high performance networking solutions
– Silicon Engineering & Headquarters: San Jose, CA
– Systems Engineering & Business Development: Germantown, MD; MA
■ Corporate Focus
• Development of complex integrated circuits applied to high performance packet processing and optical transport applications in support of our systems
• Systems that deliver high performance protocol-agnostic encryption adaptation, protocol interworking, and WAN acceleration for government agencies and commercial enterprises

Wide Area Networking Challenges

Wide Area Networks can often be difficult to deploy for many popular compute and storage applications. Common issues include:
■ Maintaining link utilization over extended distances
■ Providing congestion control and avoiding packet loss
■ Fairly sharing link bandwidth across multiple, concurrent applications
■ TCP/IP acknowledgement delays that grow with distance, degrading performance for most applications (see the sketch below)
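The last point is easy to quantify: a TCP sender can keep at most one window of unacknowledged data in flight, so throughput is capped at window / RTT. A minimal back-of-the-envelope sketch in Python, assuming illustrative values (a 10 Gbps link, a classic un-scaled 64 KB TCP window, and roughly 5 microseconds of fiber propagation delay per kilometer):

```python
# Back-of-the-envelope sketch (illustrative values, not measured data):
# a TCP sender can keep at most one window of unacknowledged data in
# flight, so throughput is capped at window / RTT.
LINK_BPS = 10e9           # assumed 10 Gbps WAN link
WINDOW_BYTES = 64 * 1024  # classic un-scaled TCP window
US_PER_KM = 5             # ~5 microseconds of fiber delay per km

for km in (1, 100, 1000, 10000):
    rtt_s = 2 * km * US_PER_KM * 1e-6        # round-trip time
    bdp_mb = LINK_BPS / 8 * rtt_s / 1e6      # bandwidth-delay product
    cap_gbps = min(WINDOW_BYTES * 8 / rtt_s, LINK_BPS) / 1e9
    print(f"{km:>6} km: RTT {rtt_s*1e3:7.2f} ms, BDP {bdp_mb:7.2f} MB, "
          f"TCP cap {cap_gbps:6.3f} Gbps")
```

At 1,000 km the un-scaled window caps the 10 Gbps link at roughly 0.05 Gbps; window scaling raises the ceiling, but the sender must still keep a full bandwidth-delay product of data in flight to sustain the line rate.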

Benefits of using InfiniBand / RDMA for Wide Area Network Applications
■ Improved performance
• Enables application and storage acceleration through RDMA
■ Increased efficiency
• Provides maximum link utilization over the WAN with fair sharing of resources between applications
■ Minimal latency
• Adds very little latency to the overall WAN connection for latency-sensitive applications
■ Cost savings
• Expands existing WAN link capacity and offloads CPU for application processing (saving on both hardware processing and network bandwidth)
■ Seamless implementation
• Transparent application interoperability with existing and new applications and storage solutions

Challenges of Extending InfiniBand Globally

[Diagram: InfiniBand extended across a campus, metro, or wide area network spanning 1 to 1000s of kilometers]
■ Extending InfiniBand between data centers is essential for providing disaster recovery, multi-site backup, and real-time data access solutions.
■ While InfiniBand's crediting mechanism is an excellent and reliable way to provide flow control, existing InfiniBand LAN hardware doesn't provide enough port buffering for deployment beyond a single site.
• Sustained bandwidth begins to drop at 500-600 meters or less (depending on the data rate) due to inadequate port buffering unless the number of virtual lanes is reduced.
• Even with the minimum number of virtual lanes configured, not enough packets can be kept in flight on the wire, because port buffer credits are starved over extended distances such as those greater than 4 kilometers (see the sketch below).
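To see why credits starve, note that a sender may only have as many bytes outstanding as the receiver has advertised buffer credits for, so sustained rate is bounded by buffer / RTT. A minimal sketch, assuming a 128 KB per-port buffer (an illustrative figure, not a spec value) and ~5 microseconds of fiber delay per kilometer:

```python
# Sketch of credit starvation (illustrative figures, not vendor specs):
# an InfiniBand port can only keep as many bytes in flight as its peer
# has advertised buffer credits for, so sustained rate <= buffer / RTT.
QDR_BPS = 32e9            # 4X QDR data rate (~32 Gbps after 8b/10b coding)
BUFFER_BYTES = 128 * 1024 # assumed per-port buffering; purely illustrative
US_PER_KM = 5             # ~5 microseconds of fiber delay per km

for km in (0.5, 4, 100, 1000):
    rtt_s = 2 * km * US_PER_KM * 1e-6
    needed_mb = QDR_BPS / 8 * rtt_s / 1e6      # buffering to fill the pipe
    sustained = min(QDR_BPS, BUFFER_BYTES * 8 / rtt_s) / 1e9
    print(f"{km:>7} km: need {needed_mb:8.3f} MB of credit, "
          f"sustained {sustained:5.2f} Gbps")
```

This is exactly the gap the extension platforms described below close: enough WAN-side buffering and flow control to keep a full bandwidth-delay product of credits outstanding over global distances.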

Welcome to IBEx WAN Acceleration Solutions

The IBEx (Intelligent Bandwidth Exchange) product family enables wide area networking acceleration using RDMA over InfiniBand for compute and storage applications to any point on the globe (up to 15,000 km and beyond).
Highlights:
■ Improves link utilization with 80-99% efficiency
■ Supports 4X InfiniBand QDR connectivity today, with future software upgradability to FDR10 for up to 40Gbps data rates
■ Provides lossless communication and true QoS capabilities for workflows
■ Flexible 10/40G WAN connectivity options over SONET/SDH, ITU-T G.709 OTN, and Ethernet

IBEx InfiniBand Support
■ The IBEx InfiniBand product family supports:
• All native InfiniBand protocols
– IPoIB, SDP, RDS, MPI, uDAPL, iSER, SRP, IB Verbs Layer, etc.
– Supports RDMA data transfers over the WAN for applications
• Connectivity for all InfiniBand data rates
– SDR, DDR, QDR, FDR10*
• Standard QSFP InfiniBand interface
– Accepts both active optical and passive copper cabling in addition to optical transceivers
• Operates as a typical InfiniBand switch device
– Appears as a 2-port switch in the InfiniBand fabric
• True 10/40G WAN-side data rates for extending native InfiniBand
– Provides 10/40G actual data rate throughput for InfiniBand extension with QDR and FDR10* connectivity

* Future support for FDR10 data rates through an IBEx system software upgrade

IBEx Platform: Typical Data Center Deployment Diagrams
[Diagram: Point-to-Point Campus / Metro Area Network Deployment. Applications, servers, and storage attach through the InfiniBand LAN switching infrastructure to an IBEx extension platform on the customer premises; with optional optical amplification, 10/40G dark fiber crosses the carrier / local fiber network over 1-250 km.]

[Diagram: DWDM Metro / Wide Area Network Deployment. The IBEx extension platform feeds a 10/40G wavelength, with optional optical amplification, into a DWDM optical transport platform on the customer premises, crossing the carrier network over 1-15,000+ km.]

Need for Distributed InfiniBand Applications and Multi-site Deployments
■ Distributed Healthcare Applications
• High resolution patient image sharing between offices
■ Financial Services
• Disaster recovery solution for low latency trading and market data feed applications
■ High Performance Computing (HPC)
• Clustered applications and cloud computing
• Post-processing and visualization
■ Global File Systems & Storage
• High performance / high volume data sharing and storage virtualization between sites
■ Clustered Databases and Warehouses
• Multi-site failover and data mirroring
• Real-time local access and information sharing
■ Content Distribution
• Global distribution of thousands of HD videos over a single connection

Network Protocol Efficiencies
■ Network Protocols:
• TCP/IP
– Typically implemented as a software protocol stack
– TCP is subject to significant "saw-tooth" performance effects upon any loss, with a slow ramp back to nominal utilization (see the sketch after this list)
– Conversion to UDP, using TCP spoofing techniques, helps performance but loses all notion of congestion management, reliable transport, and in-order delivery
– TCP/IP utilization degrades significantly with multiple sessions and any congestion, due to its reactive congestion control
• RDMA over InfiniBand (IB)
– RDMA (Remote Direct Memory Access) is a hardware transfer, initiated by the software application, from local memory across the network to the remote server or mass storage system
– InfiniBand is lossless, with proactive end-to-end flow control and reliable, in-order delivery with error detection
– With InfiniBand extension, InfiniBand can run on nearly any optical or traffic engineered network, utilizing 90%+ of the available bandwidth
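The saw-tooth cost of a single loss can be estimated from classic Reno-style AIMD behavior: the congestion window halves on loss and regrows by about one MSS per round trip, so recovery alone takes roughly (W/2) RTTs. A minimal sketch with illustrative numbers (a 10 Gbps path, 1460-byte segments); modern variants such as CUBIC shorten but do not eliminate this ramp:

```python
# Sketch of the TCP "saw-tooth" (standard AIMD behavior, illustrative
# numbers): after one loss the congestion window halves, then regrows by
# about one MSS per RTT, so recovery alone takes ~(W/2) round trips.
LINK_BPS = 10e9  # assumed 10 Gbps path
MSS = 1460       # typical Ethernet TCP segment payload

for rtt_ms in (1, 10, 100):
    rtt_s = rtt_ms / 1e3
    w_full = LINK_BPS / 8 * rtt_s / MSS   # segments needed to fill the pipe
    recovery_s = (w_full / 2) * rtt_s     # round trips to climb back
    print(f"RTT {rtt_ms:>3} ms: full window ~{w_full:10,.0f} segments, "
          f"~{recovery_s:8,.1f} s to recover from a single loss")
```

On a 100 ms long-haul path, one lost packet leaves the link underutilized for over an hour of ramp-up; InfiniBand's credit-based flow control avoids the loss in the first place.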

Large Data JCTD: Protocol Performance Comparison

[Charts: typical RDMA/IB vs. TCP/IP/Ethernet performance on a 1 Gbps circuit (~8,000 miles, ~13,000 fiber miles), and typical RDMA/IB performance on an 8 Gbps circuit (~1,200 miles, ~2,000 fiber miles)]

■ RDMA over IB provides very efficient use of available bandwidth with near-linear scaling
• RDMA/IB performance ≥ 80%; TCP/IP performance ≤ 40%
• RDMA/IB CPU usage estimated 4x less
■ The InfiniBand connection is lossless, with nearly perfect fair sharing of bandwidth across multiple, concurrent data flows

* Slide content and performance data obtained from the Large Data JCTD Public Presentation

Orange / ESnet / Bay Microsystems: 40G IB Extension over SONET OC-768
■ Application testing performed in July/August 2011 at Brookhaven National Laboratory as part of the ESnet ANI Testbed project
■ Achieved 96% efficiency of usable bandwidth through concurrent streaming of RDMA applications
■ Utilized a SONET OC-768 (40G) WAN circuit spanning 370 km from Upton, NY to Long Island and back
[Diagram: ESnet ANI Testbed. IBEx G40 platforms at AOFA and BNL, each fronting applications, servers, and storage, connect through Infinera optical transport platforms across a 370 km fiber loop on the NY / Long Island metro area network carrying the SONET OC-768 service.]

Orange / ESnet / Bay Microsystems: ANI Testbed Performance Data
[Chart: Bidirectional Maximum Bandwidth (RC), in MB/second (0-9,000), versus transmit queue depth (1-1,024) for message sizes from 64 to 524,288 bytes]
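A queue-depth and message-size sweep like the one in the chart is the kind of measurement the OFED perftest suite produces. A minimal, hypothetical reproduction sketch in Python, assuming the ib_write_bw tool with its -b (bidirectional), -s (message size), and -t (transmit queue depth) flags, a listener already running on the peer ("ib_write_bw -b"), and a made-up host name:

```python
# Hypothetical sweep sketch using the OFED perftest tool ib_write_bw
# (flags assumed: -b bidirectional, -s message size in bytes,
# -t transmit queue depth). The peer host name is illustrative only.
import itertools
import subprocess

SERVER = "remote-ibex-host"  # hypothetical peer across the IBEx WAN link

for depth, size in itertools.product((1, 32, 1024), (64, 65536, 524288)):
    cmd = ["ib_write_bw", "-b", "-s", str(size), "-t", str(depth), SERVER]
    result = subprocess.run(cmd, capture_output=True, text=True)
    # perftest prints a summary table; keep the last non-empty line
    lines = [l for l in result.stdout.splitlines() if l.strip()]
    print(f"depth={depth:<5} size={size:<7}",
          lines[-1] if lines else "(no output)")
```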

SC11: Orange / ESnet / Bay Microsystems

World's First Long Distance 40G RDMA over InfiniBand Data Transfer Demonstration

■ Native 4X InfiniBand QDR is extended over a 40G Ethernet / 100G MPLS network circuit provided by ESnet

[Diagram: ~7,000 fiber mile loop spanning Seattle, Salt Lake City, and Chicago]

SC11: Booth Demonstrations
■ Remote Visualization Demonstration
• Visualization accessing remote data, leaving the dataset intact at the remote node
■ Uncompressed Parallel HD Video Streaming over Distance
• Transfers parallel, independent streams consuming full wire bandwidth
■ High Performance "Big Data" File Transfers
• Demonstrates high bandwidth transfers over long-haul wide area networks

IBEx M40: 4X InfiniBand QDR / FDR10 Extension / 40G WAN Acceleration Platform
■ IBEx M40 Main Features
• 40G InfiniBand extension platform providing connectivity for:
– 4X InfiniBand QDR / FDR10 (up to 40Gbps) [1 x QSFP], 10G Ethernet [2 x SFP+], and 1G Ethernet [2 x SFP]
• Provides 40G WAN extension over:
– 40G Ethernet (40GBase-SR4/LR4), IPv4/IPv6, or dark fiber
• Enhanced internal port buffering and flow control capabilities enabling global InfiniBand extension at full line rate from 1 to 15,000 km
• Easy-to-use secure graphical user interface [HTTPS] and command line interface [SSH]
• Compact, low power (<150 watts), 1U 19-inch rack mountable chassis
– Redundant, hot-swappable (dual-input) power supplies and fans
[Front panel: serial console, management Ethernet, 2 x 1G Ethernet, 4X IB QDR, 1 x 40G Ethernet, 2 x 10G Ethernet]

IBEx G40: 4X InfiniBand QDR / FDR10 Extension / 40G WAN Acceleration Platform
■ IBEx G40 Main Features
• 40G InfiniBand extension platform providing connectivity for:
– 4X InfiniBand QDR / FDR10 (up to 40Gbps) [1 x QSFP], 10G Ethernet [2 x SFP+], and 1G Ethernet [2 x SFP]
• Provides 40G WAN extension over:
– SONET OC-768/SDH STM-256, ITU-T G.709 OTU3, or dark fiber
• Enhanced internal port buffering and flow control capabilities enabling global InfiniBand extension at full line rate from 1 to 15,000 km
• Easy-to-use secure graphical user interface [HTTPS] and command line interface [SSH]
• Compact, low power (<150 watts), 1U 19-inch rack mountable chassis
– Redundant, hot-swappable (dual-input) power supplies and fans
[Front panel: serial console, management Ethernet, 2 x 1G Ethernet, 4X IB QDR, 2 x 10G Ethernet, 40G WAN (SONET OC-768/SDH STM-256, ITU-T G.709 OTU3, dark fiber)]

IBEx M10/G10/M20/G20: 4X InfiniBand QDR Extension / 10G WAN Acceleration Platforms
■ Single 10G (actual data rate) InfiniBand over the WAN via SONET OC-192/SDH STM-64, ITU-T G.709 OTU2, or 10G Ethernet
■ Dual 10G (actual data rate) InfiniBand over the WAN for site-to-site link redundancy or multi-site connectivity configurations
For more information please contact:
Eric Dube
Senior Product Manager of Systems
Bay Microsystems, Inc.
Phone: (301) 944-8149
Email: [email protected]
http://www.baymicrosystems.com

IBEx G40 Platform Connectivity Diagram
[Diagram: IBEx G40 connectivity. LAN side: 4X InfiniBand QDR (QSFP Port 1), 10G Ethernet LAN (2 x SFP+ transceivers), 1G Ethernet LAN (SFP transceivers), serial console (RJ45), and management Ethernet (RJ45). WAN side: 40G WAN (LC fiber) over SONET OC-768/SDH STM-256, ITU-T G.709 OTU3, WDM, or dark fiber.]

4X InfiniBand QDR and 1/10G Ethernet LAN connections are encapsulated over the 40G WAN link.