Advanced Storage Technologies for High Performance Computing. Sorin Faibish, EMC NAS Senior Technologist. IDC HPC User Forum, April 14-16, Norfolk, VA


Page 1: Emc

Advanced Storage Technologies for High Performance Computing

Sorin Faibish, EMC NAS Senior Technologist
IDC HPC User Forum, April 14-16, Norfolk, VA

Page 2: Emc


IDC HPC User Forum 2008

New HPC Storage Intensive Applications

Storage Challenges*
– New algorithms that can scale to search and process massive datasets
– New metadata management of distributed data sources
– New platforms that provide uniform high-speed memory access to multi-terabyte data structures
– Hybrid interconnect architectures to process and filter multi-gigabyte data streams from scientific instruments
– High-performance, high-reliability, petascale distributed file systems
– New approaches to software mobility, so that algorithms can execute on nodes where the data resides
– Flexible and high-performance software integration technologies running on diverse computing platforms
– Data signature generation techniques for data reduction and rapid processing

*Computer Magazine: http://www.computer.org/portal/cms_docs_computer/computer/homepage/0408/R4gei.pdf

Page 3: Emc


New Storage Technologies for HPC

Storage Technologies
– Virtualization to address the multi-core problem
– CDP and memory snapshots to address storage failures during computation
– DR and distributed cache appliances to address computation across geographies
– SSD disk technology to address Data Intensive Super Computing tasks as well as decrease power consumption of storage
– pNFS and RDMA technologies to increase the I/O speeds and reduce computation cycles

Storage at Previous HPC User Forum

Page 4: Emc


New Concept – Better Utilization of multi-cores

Current Implementation
– Application split on multiple single core SMP HW
– Use middleware SW (Platform)

Page 5: Emc


New Concept – Better Utilization of multi-cores

Dual-core support added
– Application modified to support SMP dual core
– CPU used: 4x 100% (100%)
– Licenses paid: 4
– Licenses used: 4

Page 6: Emc


New Concept – Better Utilization of multi-cores

Quad-core chips appear
– CPU used: 4x 100% (4/8=50%)
– Licenses paid: 8
– Licenses used: 4
– Application must be modified or

Page 7: Emc


New Concept – Better Utilization of multi-cores

Quad-core chips appear
– CPU used: 4x 100% (50%)
– Licenses paid: 8
– Licenses used: 4
– Application must be modified or
– Use VM with CPU affinity
– CPU used: 8x80% (80%)
– Licenses used: 8
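The percentages on this slide follow from simple arithmetic: busy cores times per-core load, divided by the cores available. A minimal sketch, assuming the total core count equals the 8 licenses paid; the function name `utilization` is illustrative and not from the slides.

```python
def utilization(busy_cores: int, load_per_core: float, total_cores: int) -> float:
    """Fraction of the total CPU capacity that is actually in use."""
    return busy_cores * load_per_core / total_cores


# Assumption: 8 cores in total, matching "Licenses paid: 8".
TOTAL_CORES = 8

# Unmodified application: only 4 cores are busy, each at 100%.
unmodified = utilization(busy_cores=4, load_per_core=1.00, total_cores=TOTAL_CORES)

# VMs with CPU affinity: all 8 cores are busy, each at 80%.
with_vms = utilization(busy_cores=8, load_per_core=0.80, total_cores=TOTAL_CORES)

print(f"unmodified application: {unmodified:.0%}")
print(f"VMs with CPU affinity:  {with_vms:.0%}")
```

The same ratio applies to licenses: 4 used out of 8 paid in the unmodified case, 8 out of 8 once VMs spread the work across every core.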

Page 8: Emc


New Concept – Better Utilization of multi-cores

N-core chips are coming
– Use VM with VT support
– CPU used: 2xNx90% (90%)
– Licenses paid=used: 2xN

Page 9: Emc


New Concept – Better Utilization of multi-cores

Core agnostic Middleware will work with as many cores as available

– Enabled by pNFS access to shared storage

Page 10: Emc


CDP + Memory Snapshots in HPC applications

[Diagram: HPC application platform connected through a SAN to heterogeneous storage (Sun, IBM, HP, HDS, EMC), with a CDP appliance holding the CDP journal and memory snapshots]

CDP technology will work with real and virtual infrastructures
– VM snapshots on central storage repository
– VM and HW hosts memory snapshots
– Any SAN or NAS storage
– Recover HPC job at any point in time (last minute failure after 2 weeks run)
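Point-in-time recovery from a journal can be illustrated with a toy model: every write is logged with a timestamp, and recovery replays the log up to the chosen moment. This is a generic sketch of the journaling idea, not EMC's CDP implementation; `JournalEntry`, `recover` and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JournalEntry:
    timestamp: float  # time of the write (hypothetical unit: seconds)
    block: int        # block address that was written
    data: bytes       # contents written to that block


def recover(journal: list[JournalEntry], point_in_time: float) -> dict[int, bytes]:
    """Rebuild the block contents as they were at point_in_time."""
    image: dict[int, bytes] = {}
    for entry in sorted(journal, key=lambda e: e.timestamp):
        if entry.timestamp > point_in_time:
            break  # ignore every write made after the recovery point
        image[entry.block] = entry.data
    return image


journal = [
    JournalEntry(10.0, 0, b"step-1"),
    JournalEntry(20.0, 1, b"step-2"),
    JournalEntry(30.0, 0, b"corrupt"),  # the late failure
]

# Roll back to just before the failing write.
state = recover(journal, point_in_time=25.0)
print(state)
```

Choosing a recovery point just before the failing write restores the last good state, so a long-running job does not have to restart from the beginning.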

Page 11: Emc


Continuous Remote Replication in HPC

[Diagram: Site A (HPC application platform) and Site B (remote HPC application platform; heterogeneous blades, VM + HW), each with a SAN, heterogeneous storage (Sun, IBM, HP, HDS, EMC) and a cache appliance]

Distributed cache engines allow distributed access to shared storage

– Remote Compute Nodes accessing the shared storage

Page 12: Emc


SSD Disks in HPC applications

Solid State Disks will replace Disk Drives
– Today HPC workloads are mostly compute intensive
– Data Intensive Super Computing (DISC) applications start to appear (see: IEEE Computer Magazine, April 2008)
– SSD will balance performance between DISC and compute intensive HPC applications
– EMC DMX has SSD today (25 SSD = 800K iops or 5 GB/sec)
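The aggregate DMX figure can be broken down per drive with simple division. A small sketch, assuming the load is spread evenly over the 25 SSDs and that 1 GB = 1000 MB; the constant names are illustrative.

```python
SSD_COUNT = 25
TOTAL_IOPS = 800_000       # "800K iops" from the slide
TOTAL_GB_PER_S = 5.0       # "5 GB/sec" from the slide

# Assumption: load is distributed evenly across all drives.
iops_per_ssd = TOTAL_IOPS / SSD_COUNT
mb_per_s_per_ssd = TOTAL_GB_PER_S * 1000 / SSD_COUNT  # decimal units

print(f"{iops_per_ssd:,.0f} IOPS per SSD")
print(f"{mb_per_s_per_ssd:.0f} MB/s per SSD")
```

Under these assumptions each drive contributes 32,000 IOPS or 200 MB/s.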

[Diagram: HPC application platform connected through a SAN to EMC storage (DMX + SSD)]

[Chart: Performance Normalized to Cost. Y-axis: Price/Performance (0 to 2). X-axis: ext2, ext3, Reiserfs, DualFS]

Page 13: Emc


pNFS addresses the storage access issues

– Removes the server layer between compute engines (CE) and shared storage

– Separates metadata (MD) traffic from data traffic

– Asymmetric storage architectures increase scalability

– SSDs increase I/O speed

[Diagram: HPC Architecture. Layer labels: HPC Jobs, Middleware, Compute Engines, Connectivity, NFS Servers, pNFS, Connectivity, SSD Storage. Callout: Storage must be Networked]

pNFS will deliver very high I/O speeds to HPC

Page 14: Emc


MD is directed to the single MD server

Data is served by storage servers or storage arrays directly from host to storage

Storage access controlled by iSCSI

I/O to native IB or 10G storage redirected via RDMA in HW
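The split between control path and data path described above can be sketched with a toy in-memory model: the client asks the metadata server once for a file's layout, then fetches each stripe directly from the storage that holds it. This does not implement the NFSv4.1/pNFS protocol, iSCSI or RDMA; the classes `MetadataServer` and `DataServer` and the function `read_file` are hypothetical.

```python
class MetadataServer:
    """Control path: records which data server holds each stripe of a file."""

    def __init__(self) -> None:
        self.layouts: dict[str, list[tuple[int, str]]] = {}

    def add_file(self, name: str, layout: list[tuple[int, str]]) -> None:
        self.layouts[name] = list(layout)

    def get_layout(self, name: str) -> list[tuple[int, str]]:
        return self.layouts[name]


class DataServer:
    """Data path: stores stripes and serves them straight to the client."""

    def __init__(self) -> None:
        self.stripes: dict[str, bytes] = {}

    def write(self, key: str, data: bytes) -> None:
        self.stripes[key] = data

    def read(self, key: str) -> bytes:
        return self.stripes[key]


def read_file(name: str, mds: MetadataServer, servers: list[DataServer]) -> bytes:
    layout = mds.get_layout(name)  # a single metadata request
    # Data never passes through the metadata server. A real client would
    # issue these reads in parallel; this sketch reads them in order.
    return b"".join(servers[server_id].read(key) for server_id, key in layout)


servers = [DataServer(), DataServer()]
servers[0].write("job.out#0", b"hello ")
servers[1].write("job.out#1", b"world")

mds = MetadataServer()
mds.add_file("job.out", [(0, "job.out#0"), (1, "job.out#1")])

result = read_file("job.out", mds, servers)
print(result)
```

Because the metadata server handles only the layout lookup, adding data servers increases aggregate bandwidth without adding load to the control path.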

[Diagram: control path and data path between the CE cache, the metadata cache and the native IB storage array cache over RDMA. Labels: iSCSI (iSER), NFS (pNFS), NFS/pNFS, File systems, Storage array]

pNFS with Infiniband RDMA value added to HPC
