Industry Brief: Tectonic Shift - HPC Networks Converge


This Industry Brief covers the convergence of InfiniBand and low-latency Ethernet for HPC.


Featuring: InfiniBand and Ethernet

Copyright © 2013 IT Brand Pulse. All rights reserved. Document # INDUSTRY2012012 v2, February 2013


HPC Network Plate Tectonics

Ethernet Meets InfiniBand

High Performance Computing (HPC) is dominated by clusters of commodity servers connected by ultra-high-performance networks. These HPC networks move massive amounts of data so fast that clustered computing and storage systems can cost-effectively replace huge monolithic supercomputers and storage systems.

Like the outer layer of the Earth, the market for HPC networks is broken up into tectonic plates. There are two major plates, Ethernet and InfiniBand, and many minor plates. Convergent boundaries have occurred between Ethernet and InfiniBand where the two major plates slowly slide towards each other, forming a subduction zone in which one plate moves underneath the other.

Enhancements to Ethernet, the bedrock of data center networking, will enable Ethernet to slowly drive InfiniBand out from underneath the distributed applications which run on clustered servers and storage.

40/100GbE: By 2016, annual sales of 40GbE and 100GbE products will reach $3 billion, 6x the annual sales of InfiniBand products in 2012.

On the periphery of data centers lies a growing sea of distributed computing applications which run on clustered servers and storage. Networking for HPC clusters will converge on higher bandwidth and lower latency Ethernet. InfiniBand will be pushed out to niche HPC applications.

[Figure: HPC Networking Plate Tectonics]


HPC Network Architecture

Scale-Out Instead of Scale-Up

HPC networks are found in Distributed Computing environments where individual servers are distributed on a network, each with its own local memory, communicating with the others by message passing.

Distributed Computing environments consist of low-cost commodity servers configured in clusters to harness the aggregate computing power of all the servers working together. The same concept is being applied to data storage, where storage systems are configured in a cluster to harness the aggregate I/O power needed to move petabytes of data.
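As a minimal illustration of the message-passing model described above, the sketch below uses Python processes in place of cluster nodes: each worker has its own private memory, and coordination happens only through messages on a queue standing in for the cluster interconnect. It is purely illustrative (a real cluster would use a messaging library such as MPI), and all names in it are hypothetical.

```python
from multiprocessing import Process, Queue

# Each worker runs in its own process with its own local memory (no shared
# state), mirroring cluster nodes. Coordination happens only by passing
# messages, here over queues standing in for the cluster interconnect.

def worker(node_id, tasks, results):
    while (msg := tasks.get()) is not None:           # None is the shutdown signal
        results.put((node_id, msg, sum(range(msg))))  # do some local computation

if __name__ == "__main__":
    tasks, results = Queue(), Queue()
    nodes = [Process(target=worker, args=(i, tasks, results)) for i in range(4)]
    for node in nodes:
        node.start()
    for n in (10, 100, 1000, 10000):                  # scatter work to the "cluster"
        tasks.put(n)
    for _ in nodes:
        tasks.put(None)                               # tell every worker to stop
    for _ in range(4):
        print(results.get())                          # gather partial results
    for node in nodes:
        node.join()
```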

IT organizations increase the performance and capacity of server and storage clusters in their distributed computing environment by adding more nodes (scaling out), versus adding more processors to a single computer (scaling up). What enables clusters to harness that compute power, and scale to large numbers of nodes, are ultra-high-performance cluster interconnects. These networks have high bandwidth for moving petabytes of shared storage for applications such as Seismic Analysis, or they are low-latency for applications such as High Frequency Trading, where billions of calculations are distributed to the cluster for quick completion.

Highly distributed (Shared Nothing) architectures are moving steadily towards mainstream business computing as a result of the Google MapReduce and Google File System papers. Those blueprints were used to establish an Apache open-source framework called Hadoop, which includes the MapReduce compute model and a distributed file system. MapReduce divides applications into small pieces distributed on nodes in the cluster, and the file system provides high bandwidth in a cluster. Oracle, SAP and other leading enterprise application vendors are offering data warehousing and other applications based on Hadoop.
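To make the MapReduce model concrete, here is a minimal, single-process word-count sketch in Python. It illustrates the map/shuffle/reduce pattern described above; it is not Hadoop's actual API, and all function names are hypothetical.

```python
from collections import defaultdict

# A toy MapReduce word count: the "map" step splits the work into small pieces,
# the "shuffle" step groups intermediate results by key, and the "reduce" step
# aggregates each group. On a real cluster (e.g., Hadoop), each phase runs in
# parallel across many nodes; here everything runs in one process.

def map_phase(document):
    """Emit (word, 1) pairs for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(mapped_pairs):
    """Group intermediate values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Aggregate all values emitted for one key."""
    return key, sum(values)

documents = ["ethernet meets infiniband", "ethernet everywhere"]
mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)   # {'ethernet': 2, 'meets': 1, 'infiniband': 1, 'everywhere': 1}
```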

Bandwidth: The amount of data processed or transferred in a given amount of time, measured in gigabits per second (Gbps). 'Throughput' and 'bandwidth' are used interchangeably.
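As a rough, back-of-the-envelope illustration of why link bandwidth matters at petabyte scale, transfer time can be estimated directly from the link rate. The sketch below assumes an idealized link (no protocol overhead, full utilization); the numbers are illustrative only.

```python
# Idealized transfer times: no protocol overhead, link fully utilized.
# Link rates are in gigabits per second; 1 PB is taken as 10**15 bytes.
PETABYTE_BITS = 8 * 10**15

for name, gbps in [("10GbE", 10), ("40GbE", 40),
                   ("56Gb FDR InfiniBand", 56), ("100GbE", 100)]:
    seconds = PETABYTE_BITS / (gbps * 10**9)
    print(f"{name:>20}: {seconds / 3600:6.1f} hours to move 1 PB")
```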

Server buses are high-bandwidth and low-latency. For a cluster to provide application performance as good as scaling “up” servers, the cluster interconnect must be high-bandwidth and low-latency.


Enhancements to Ethernet

Close the Performance Gap

From the turn of the millennium until 2011, InfiniBand speed jumped from 10Gb to 20Gb to 40Gb to 56Gb. At the same time, Ethernet bandwidth increased from 1Gb to 10Gb. As a result, InfiniBand networks were adopted in an increasing number of high-performance cluster applications.

In 2011, the Ethernet industry unveiled its first 40GbE adapters and switches, as well as 100GbE switches, immediately closing the performance gap with InfiniBand for bandwidth-intensive applications.

In the next few years, when 4x EDR InfiniBand arrives along with 100Gb Ethernet adapters, InfiniBand and Ethernet will be at parity with end-to-end 100Gb networks.

For low-latency applications, Ethernet has incorporated Remote Direct Memory Access (RDMA): direct memory access from the memory of one computer into that of another without involving the operating systems. RDMA allows Ethernet to support the same kind of large, low-latency clusters as InfiniBand.

With the market for distributed computing taking off, Ethernet chips from volume Ethernet NIC vendors will include support for RDMA.

Latency: The time between the start and completion of one action, measured in microseconds (µs).
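To make the microsecond scale concrete, round-trip latency can be measured with a simple ping-pong loop. The hypothetical sketch below uses an ordinary TCP socket over loopback; every round trip passes through the operating system's network stack, which is precisely the overhead RDMA is designed to bypass.

```python
import socket
import threading
import time

HOST, PORT = "127.0.0.1", 50007   # hypothetical loopback endpoint for the demo
ROUNDS = 10_000

def echo_server():
    # Minimal echo server: send back whatever it receives.
    with socket.create_server((HOST, PORT)) as srv:
        conn, _ = srv.accept()
        with conn:
            while data := conn.recv(64):
                conn.sendall(data)

threading.Thread(target=echo_server, daemon=True).start()
time.sleep(0.2)                   # give the server a moment to start listening

with socket.create_connection((HOST, PORT)) as client:
    client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # no Nagle batching
    start = time.perf_counter()
    for _ in range(ROUNDS):
        client.sendall(b"ping")   # send a request...
        client.recv(64)           # ...and wait for the echo (one full round trip)
    elapsed = time.perf_counter() - start

print(f"average round-trip latency: {elapsed / ROUNDS * 1e6:.1f} microseconds")
```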

Special low-latency RDMA over Ethernet networks have been available for several years. Volume Ethernet NIC vendors will be offering support for RDMA in the next few years.

Only a few years ago, QDR InfiniBand had a 400% performance advantage over 10GbE. Today, 56Gb FDR InfiniBand has a 40% advantage over 40GbE. In a few years, 100GbE and 4x EDR InfiniBand will be at parity.

[Figure: 104Gb 4x EDR InfiniBand vs. 100GbE]
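These comparisons follow from the nominal link rates cited in this brief; the short sketch below simply computes the ratios (an illustrative calculation, not vendor benchmark data).

```python
# Nominal link rates cited in this brief, in Gb/s.
rates = {
    "QDR InfiniBand (4x)": 40,
    "10GbE": 10,
    "FDR InfiniBand": 56,
    "40GbE": 40,
    "4x EDR InfiniBand": 104,
    "100GbE": 100,
}

def ratio(a, b):
    """Link-rate ratio of technology a over technology b."""
    return rates[a] / rates[b]

print(f"QDR vs 10GbE : {ratio('QDR InfiniBand (4x)', '10GbE'):.2f}x")   # 4.00x
print(f"FDR vs 40GbE : {ratio('FDR InfiniBand', '40GbE'):.2f}x")        # 1.40x (~40% advantage)
print(f"EDR vs 100GbE: {ratio('4x EDR InfiniBand', '100GbE'):.2f}x")    # 1.04x (near parity)
```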


Building on Network Bedrock

The 99% of Installed LAN/Cluster Interconnect Ports

Short term, Ethernet-based HPC networking is just beginning to move from 10Gb technology to 40Gb technology, which significantly closes the performance gap with 56Gb FDR InfiniBand. As a result, inertia favors InfiniBand, especially for applications with requirements for the highest bandwidth and lowest latency. This is highlighted by the fact that 50% of the Top 500 supercomputers now use InfiniBand and 20% of all HPC networks are InfiniBand networks.

Long term, the enhanced performance of Ethernet, along with customers' desire to protect their massive investment in products and expertise, will push InfiniBand into niche applications. Over time, the vast majority of HPC architects and administrators will build on Ethernet, which comprises 99% of all LAN and HPC network ports and is the bedrock of their data center networks.

Top 500: The world's 500 fastest supercomputers, ranked on their performance using the LINPACK benchmark.

With 99% of the combined LAN and HPC network ports installed in the world, Ethernet represents an investment which IT organizations want to protect, as well as the mindshare of most networking professionals.

[Figure: LAN and Cluster Interconnect Ports Installed. Source: IT Brand Pulse]


The Bottom Line

The Tectonic Shift Has Started

The bottom line is this: the tectonic shift in HPC networking started with the availability of 40/100GbE, and the convergence of HPC networking onto Ethernet will halt the inertia of InfiniBand in a few short years. Annual sales of 40GbE and 100GbE products are expected to reach $3 billion in 2016, 6x the revenue of InfiniBand in 2012 and 3x the projected InfiniBand product sales in 2016.

Related Links

To learn more about the companies, technologies, and products mentioned in this report, visit the following web pages:

Emulex Network Xceleration (NX) Solutions
Top 500 Supercomputer Sites
InfiniBand Speed Road Map
IT Brand Pulse Datacenter Acceleration

About the Author

Frank Berry is founder and senior analyst for IT Brand Pulse, a trusted source of data and analysis about IT infrastructure, including servers, storage and networking. As former vice president of product marketing and corporate marketing for QLogic, and vice president of worldwide marketing for Quantum, Mr. Berry has over 30 years of experience in the development and marketing of IT infrastructure. If you have any questions or comments about this report, contact [email protected].