Flash Storage & 40GbE for OpenStack - Flash Memory Summit (Transcript)


Page 1:

Flash Storage & 40GbE for OpenStack

Kevin Deierling

Vice President Mellanox Technologies

kevind AT mellanox.com

Flash Memory Summit 2014

Santa Clara, CA


Page 2:

OpenStack Integration

SSD

Page 3:

• Using OpenStack built-in components and management
  – No additional software is required
  – RDMA is already inbox and used by OpenStack customers
• RDMA enables faster flash performance and lower CPU consumption
• A more efficient network saves CapEx and OpEx for a given workload
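As a sketch of what "no additional software" means in practice: on an Icehouse-era deployment, moving Cinder's LVM backend from iSCSI/TCP to iSER was a configuration change. The option names and driver path below are from that era and should be verified against your OpenStack release; the IP address is a placeholder, and the fragment is written to a temp file here rather than /etc/cinder/cinder.conf.

```shell
# Hypothetical cinder.conf fragment enabling iSER transport
# (Icehouse-era option names; verify against your release).
cat > /tmp/cinder-iser.conf <<'EOF'
[DEFAULT]
# LVM driver variant that exports volumes over iSER instead of iSCSI/TCP
volume_driver = cinder.volume.drivers.lvm.LVMISERDriver
# Address of the RDMA-capable (RoCE) interface on the storage node (placeholder)
iser_ip_address = 192.168.40.10
# tgt is the slide's target; tgtadm is its admin helper
iser_helper = tgtadm
EOF
```

With this in place, volumes created through the normal Cinder API are exported over iSER, which is how the "2.5x more VMs over the exact same setup" comparison later in the deck was run.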

Fastest OpenStack Storage Access

[Diagram: compute servers run VMs (each with a guest OS) on a KVM hypervisor and use Open-iSCSI with iSER on the adapter; across the switching fabric, storage servers run an iSCSI/iSER target (tgt) with an RDMA cache and local SSD/HDD, managed by OpenStack (Cinder)]

Using RDMA to accelerate iSCSI storage

Page 4:

High-Performance Flash Storage Systems

40GbE or InfiniBand FDR ports (Mellanox ConnectX)

• 8 x 56 Gb FDR IB / 40 GbE; from 3 TB to 48 TB (in 2U); up to 385 TB usable; 1M write IOPS, 2M read IOPS; 50us IO latency
• Gemini F600: up to 1 million IOPS; latency as low as 100us; up to 70 TB of raw flash; only 3 rack units (3U)
• EF540: up to 38 TB (24 + 24 SSDs); 24 drives in 2U; 40 Gb/s iSCSI

Page 5:


• Flash & iSER overcome IOPS bottlenecks in Virtual Desktop Infrastructure (VDI)
  – LSI Nytro MegaRAID accelerates disk access through SSD-based caching
  – Mellanox ConnectX®-3 10/40GbE adapter with RDMA
  – Accelerates hypervisor access to fast shared flash over 40G Ethernet
  – Zero-overhead replication
• Unmatched VDI density of 150 VMs per server
  – Using iSCSI/RDMA (iSER) enabled 2.5x more VMs compared to using iSCSI with TCP/IP over the exact same setup

Flash & 40GbE Enable 2.5x More VMs

[Bar chart: number of virtual desktop VMs (scale 0 to 160) for Intel 10GbE iSCSI/TCP vs. ConnectX3 10GbE,… vs. ConnectX3 40GbE,…; 2.5x more VMs]

[Diagram: redundant storage cluster with primary/secondary replication; each node runs an iSCSI/RDMA (iSER) target over software RAID (MD) on an LSI caching flash/RAID controller, connected through a Mellanox SX1012 10/40GbE switch]

Benchmark configuration (redundant storage cluster, MSRP of $25K):
• 2 x Xeon E5-2650
• Mellanox ConnectX®-3 Pro, 40GbE/RoCE
• LSI Nytro MegaRAID NMR 8110-4i

Page 6:

CEPH and Networks

• High performance networks enable maximum cluster availability
  – Clients, OSDs, Monitors, and Metadata servers communicate over multiple network layers
  – Real-time requirements for heartbeat, replication, recovery, and re-balancing
• Cluster (“backend”) network performance dictates the cluster’s performance and scalability
  – “Network load between Ceph OSD Daemons easily dwarfs the network load between Ceph Clients and the Ceph Storage Cluster” (Ceph documentation)
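The client-facing vs. backend traffic split described above maps directly onto two standard ceph.conf options; the subnets below are placeholders, and the fragment is written to a temp file for illustration.

```shell
# Hypothetical ceph.conf fragment separating public (client) traffic
# from cluster (replication/recovery) traffic; subnets are placeholders.
cat > /tmp/ceph-networks.conf <<'EOF'
[global]
# Clients, monitors, and MDS reach OSDs over the public network
public network  = 10.10.0.0/24
# OSD-to-OSD replication, recovery, and heartbeats use the 40GbE fabric
cluster network = 10.20.0.0/24
EOF
```

Putting the cluster network on the fastest fabric is the point of the quote above: OSD-to-OSD traffic, not client traffic, is what usually saturates first.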

Page 7:

Deploy CEPH with 40GbE Interconnect

• Building scalable, performant storage solutions
  – Cluster @ 40Gb Ethernet
  – Clients @ 10Gb/40Gb Ethernet
• Directly connect over 500 client nodes
  – Target retail cost: US$350/1TB
• Scale-out customers use SSDs
  – For OSDs and journals

Page 8:

CEPH Performance with 40GbE

• Cluster (private) network @ 40GbE
  – Smooth HA, unblocked heartbeats, efficient data balancing
• Throughput clients @ 40GbE
  – Guarantees line rate for high ingress/egress clients
• IOPS clients @ 10GbE/40GbE
  – 100K+ IOPS/client @ 4K blocks

Throughput testing: fio benchmark, 8M blocks, 20GB file, 128 parallel jobs, RBD kernel driver with Linux kernel 3.13.3, RHEL 6.3, Ceph 0.72.2

IOPS testing: fio benchmark, 4K blocks, 20GB file, 128 parallel jobs, RBD kernel driver with Linux kernel 3.13.3, RHEL 6.3, Ceph 0.72.2
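The throughput test above can be approximated with an fio job file like the one below. The block size, file size, and job count come from the slide; the async engine, `direct=1`, write direction, and the `/dev/rbd0` device path (the usual name of a kernel-mapped RBD) are my assumptions, not stated on the slide.

```shell
# Sketch of the slide's throughput test as an fio job file
# (8M blocks, 20GB per job, 128 parallel jobs against a kernel-mapped RBD).
cat > /tmp/ceph-throughput.fio <<'EOF'
[global]
# engine and O_DIRECT are assumptions, not stated on the slide
ioengine=libaio
direct=1
# parameters from the slide: 8M blocks, 20GB file, 128 parallel jobs
bs=8m
size=20g
numjobs=128
rw=write

[rbd-throughput]
# kernel RBD device path is a placeholder
filename=/dev/rbd0
EOF
# Run with: fio /tmp/ceph-throughput.fio
```

The IOPS variant would change `bs=8m` to `bs=4k` and `rw=write` to a random pattern.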

[Diagram: Ceph nodes (monitors, OSDs, MDS) and the admin node communicate over a 40GbE cluster network; client nodes connect over a 10GbE/40GbE public network]

20X Higher Throughput

4X Higher IOPS with 40GbE clients

http://www.mellanox.com/related-docs/whitepapers/WP_Deploying_Ceph_over_High_Performance_Networks.pdf

Page 9:

Hadoop with Ceph, Flash, & 40GbE

• Increase Hadoop cluster performance
• Accelerate Hadoop over Ceph
  – Flash-based OSDs
  – High-performance interconnect
• Dynamically scale compute and storage
• Eliminate single point of failure

[Diagram: name node/job tracker and data nodes deployed alongside Ceph nodes and an admin node]

Page 10:

CloudX Architecture Overview

• Hardware
  – Industry-standard components: servers, storage, interconnect, software
  – Mellanox 40GbE interconnect
  – Solid state drives (SSDs)
• OpenStack components
  – Nova compute
  – Network controller node
  – Cinder/iSER & Ceph storage
  – Mellanox OpenStack plug-ins
• Toolkit for automated switch and server deployment

Page 11:

Thanks! Questions?


Page 12:

Cinder Block Storage Integration

• Cinder (target)
  – Volume management
  – Persistence independent of the VM
• Nova (initiator)
  – Boot VM instance
  – Attach volume to VM
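The target/initiator split above can be sketched as a short command sequence with the era's python-cinderclient and python-novaclient CLIs. Every name (volume, flavor, image, server) and the `<volume-id>` are placeholders; the script is written to a file rather than executed here, since it needs a live cloud.

```shell
# Hypothetical Cinder/Nova workflow; all names and IDs are placeholders.
cat > /tmp/cinder-nova-workflow.sh <<'EOF'
# Cinder (target): create a 10 GB volume; it persists independently of any VM
cinder create --display-name vdi-vol 10
# Nova (initiator): boot an instance, then attach the volume as /dev/vdb
nova boot --flavor m1.small --image rhel63 vm-01
nova volume-attach vm-01 <volume-id> /dev/vdb
EOF
```

With the iSER-enabled Cinder backend from earlier in the deck, the attach step is where the RDMA data path comes into play: the hypervisor's Open-iSCSI initiator logs into the tgt target over iSER.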