Optimizing Lustre and GPFS with DDN

Robert Triendl, VP of Worldwide HPC Strategy, DataDirect Networks


© 2014 DataDirect Networks, Inc. * Other names and brands may be claimed as the property of others. Any statements or representations around future events are subject to change. ddn.com

File Systems @ DDN


File System Basics

• File systems are where your data lives

• File systems are complex software-level technologies…

• … so there are always surprises!

• There are huge differences in performance, functionality, and reliability

• When it comes to performance, no file system fits all requirements


Test and Benchmark Labs


GTLS | Benchmark Lab Sites

• EMEA Lab: Dusseldorf, Germany

• Asia Pacific Lab: Tokyo, Japan

• East Coast Lab: Columbia, MD

• West Coast Lab: Sunnyvale, CA


DDN and Lustre

• Started with Lustre 0.6, and the first commercial Lustre support contract with CFS!

• Over 250 EXAScaler customers worldwide today and many more using DDN storage for Lustre

• Customers in many industries (HPC centers, Large Experimental Facilities, Oil & Gas, Life Science, Automotive, etc.)

• Very broad set of applications supported


Pie chart segments, as labeled on the slide:

• Corp Data: 4%
• Government Security: 17%
• Research Data Analysis: 28%
• HPC Archive: 18%
• HPC Work: 20%
• HPC Work Corp: 12%

Diagram labels, in the order they appear on the slide:

• Project Quota
• Metadata Perf
• SSD Acceleration
• Fine-Grained Monitoring
• NFS/CIFS Access
• Management
• Connectors
• Object/Cloud Links
• Data Management
• Backup/Replication
• HSM
• Client Performance
• Cluster Integration
• Large I/O
• IME Caching
• Security Features
• Lustre WAN
• RAS
• Small File I/O


DDN Open Source Lustre Contributions

[Figure: bar chart of contributions per Lustre release. Horizontal axis: 2.1, 2.4.0, 2.3.50-2.4.0, 2.5.0, 2.5.50-2.6.0. Vertical axis: 0 to 180. Series: EMC, CEA, SUSE, Bull, Other, Cray, LLNL, Xyratex, DDN.]


Large RPC Size Effects

[Figure: two panels, WRITE and READ. Horizontal axis: number of processes (0 to 700). Vertical axis: 0% to 120%. Series in each panel: 7.2K SAS (1MB RPC), 7.2K SAS (4MB RPC), SSD (1MB RPC).]
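The 1MB and 4MB RPC sizes compared on this slide correspond to a page count per RPC on the client. The slide does not show how the sizes were configured, so the following is only a sketch under assumptions: 4 KiB pages, and the Lustre client tunable `osc.*.max_pages_per_rpc` as the knob being changed. The helper name `pages_per_rpc` is hypothetical.

```python
MIB = 1024 * 1024


def pages_per_rpc(rpc_size_bytes: int, page_size_bytes: int = 4096) -> int:
    """Return how many pages fit in one RPC of the given size.

    Assumes the RPC size is a whole multiple of the page size
    (4 KiB pages are an assumption, not stated on the slide).
    """
    if rpc_size_bytes <= 0 or rpc_size_bytes % page_size_bytes != 0:
        raise ValueError("RPC size must be a positive multiple of the page size")
    return rpc_size_bytes // page_size_bytes


for size_mib in (1, 4):
    pages = pages_per_rpc(size_mib * MIB)
    # The lctl line is illustrative only; check the manual for your release.
    print(f"{size_mib} MiB RPC = {pages} pages "
          f"(e.g. lctl set_param osc.*.max_pages_per_rpc={pages})")
```

With 4 KiB pages, a 1 MiB RPC is 256 pages and a 4 MiB RPC is 1024 pages. Server-side settings may also need to match, which this sketch does not cover.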


Lustre Metadata

• Limited single client scaling

• Good scaling with clock speed

• Good scaling with core count and HT

• Great scaling with DNE

• Limitations on Dir Creates (TBD)


mmap() I/O Performance Improvements

[Figure: mmap() Read Performance (1MB block size). Bars: lustre-1.8.9, lustre-2.5.2, DDN branch. Vertical axis: 0 to 500.]

[Figure: mmap() Read Performance by block size (32K, 128K, 512K, 1024K). Series: Lustre-1.8.9, DDN branch. Vertical axis: 0 to 500.]
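For readers unfamiliar with the access pattern being measured, here is a minimal sketch of reading a file through mmap() in fixed-size blocks. It is not the benchmark used for the slide, which is not identified; the function names `mmap_read` and `demo` are hypothetical, and the demo uses a small temporary local file rather than a Lustre mount.

```python
import mmap
import os
import tempfile


def mmap_read(path: str, block_size: int) -> int:
    """Map a file read-only and touch it in block_size chunks.

    Returns the total number of bytes read through the mapping.
    """
    total = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            for offset in range(0, size, block_size):
                # Slicing copies the bytes out of the mapping, which
                # forces the underlying pages to be faulted in.
                chunk = mapped[offset:offset + block_size]
                total += len(chunk)
    return total


def demo() -> int:
    """Create a ~3 MiB temporary file and read it in 1 MiB blocks."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * (3 * 1024 * 1024 + 17))
        path = tmp.name
    try:
        return mmap_read(path, 1024 * 1024)
    finally:
        os.remove(path)


print(demo(), "bytes read via mmap")
```

Varying `block_size` (for example 32K, 128K, 512K, 1024K) and timing each run would give a comparison along the lines of the second chart.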


EXAScaler Monitoring

[Diagram: collectd with the DDN monitoring plugin and the Graphite plugin runs on the OSS/MDS nodes and Lustre clients; statistics go to the Monitoring Server (collectd, graphite) via UDP(TCP)/IP based small text message transfer.]

• Lightweight
• Near real-time
• Massive scale
• Customizable

• File system, OST Pool, OST/MDT stats, etc.
• JOB ID, UID/GID, aggregation of application's stats, etc.
• Archive of data by policy
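The "small text message" transfer to Graphite can be illustrated with Graphite's plaintext format, where each metric is one line of the form `path value timestamp`. This is only a sketch and does not reproduce the DDN monitoring plugin or collectd's Graphite plugin: the metric path, host name, and helper names below are hypothetical, and port 2003 is assumed to be the plaintext listener on the monitoring server.

```python
import socket
import time


def graphite_line(path: str, value: float, timestamp: int) -> str:
    """Format one metric in Graphite's plaintext form: 'path value timestamp'."""
    return f"{path} {value} {timestamp}\n"


def send_udp(line: str, host: str, port: int = 2003) -> None:
    """Send a single metric line as one UDP datagram (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# Hypothetical metric name for one OST statistic.
line = graphite_line("lustre.fs0.OST0001.write_bytes", 1048576, int(time.time()))
print(line, end="")
# To actually send it, point this at a real Graphite host (assumed name):
# send_udp(line, "monitoring.example.com")
```

One text line per statistic keeps each message small, which fits the "lightweight" and "massive scale" points on the slide.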


EXAScaler Monitoring

• Running at TITECH: over 112 Object Storage Targets across 1700 clients

• That’s around 1M statistics

• Statistics need to be stored every few seconds

• Demo of over 10M stats at DDN Booth
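As a back-of-the-envelope check on these numbers, the sketch below assumes statistics are tracked per client per OST, which the slide does not state. The 5-second interval is likewise a hypothetical stand-in for "every few seconds".

```python
osts = 112               # from the slide ("over 112", so a lower bound)
clients = 1700           # from the slide
total_stats = 1_000_000  # "around 1M statistics"

# Assumption: one set of statistics per (client, OST) pair.
pairs = osts * clients
stats_per_pair = total_stats / pairs


def ingest_rate(stats: int, interval_s: float) -> float:
    """Statistics per second that must be stored at a given sampling interval."""
    return stats / interval_s


print(f"client/OST pairs: {pairs}")
print(f"statistics per pair: {stats_per_pair:.2f}")
# Hypothetical 5-second interval.
print(f"ingest rate at 5 s: {ingest_rate(total_stats, 5):.0f} stats/s")
```

Under these assumptions there are 190,400 client/OST pairs, so roughly five statistics per pair give about 1M values, and storing them every 5 seconds means about 200,000 values per second.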


VMs on GRIDScaler: 256 VMs on 16 Clients

[Figure: two panels of Throughput (MB/sec) versus Number of Process (1, 2, 4, 8, 16, 32, 64, 128, 256). Vertical axes: 0 to 16000 in the first panel, 0 to 12000 in the second.]


GRIDScaler for OpenStack vbench Results

[Figure: Total Bandwidth. Series: Read Bandwidth, Write Bandwidth. Horizontal axis: 1, 10, 20, 30, 40. Vertical axis: 0 to 1000.]

[Figure: Total IOPS. Series: Read IOPS, Write IOPS. Horizontal axis: 1, 10, 20, 30, 40. Vertical axis: 0 to 6000.]