In ~2011, we discovered an exciting new failure mode in ...files.meetup.com/14556262/Austin Data...
Transcript of In ~2011, we discovered an exciting new failure mode in ...files.meetup.com/14556262/Austin Data...
![Page 1: In ~2011, we discovered an exciting new failure mode in ...files.meetup.com/14556262/Austin Data Meetup - TACC... · DSSD Storage • Not SSD ; a custom interface allows access to](https://reader036.fdocuments.us/reader036/viewer/2022071218/604f77e96012b415e00c4f9c/html5/thumbnails/1.jpg)
• In ~2011, we discovered an exciting new failure mode in large scale systems.
• We called this failure mode “Hadoop”. • Recipe for success: take a big HPC system
– A huge central filesystem – Optimized for large, sequential, access to multi GB to TB size files – With a highly tuned, low level C interface
• And on that run software that: – Assumes a small, massively distributed filesystem. – Optimized for very “small” files. – With an untuned, well… Java.
![Page 2: In ~2011, we discovered an exciting new failure mode in ...files.meetup.com/14556262/Austin Data Meetup - TACC... · DSSD Storage • Not SSD ; a custom interface allows access to](https://reader036.fdocuments.us/reader036/viewer/2022071218/604f77e96012b415e00c4f9c/html5/thumbnails/2.jpg)
To address these and other needs for the Data Driven
researcher, we built Wrangler
![Page 3: In ~2011, we discovered an exciting new failure mode in ...files.meetup.com/14556262/Austin Data Meetup - TACC... · DSSD Storage • Not SSD ; a custom interface allows access to](https://reader036.fdocuments.us/reader036/viewer/2022071218/604f77e96012b415e00c4f9c/html5/thumbnails/3.jpg)
Goals of the Wrangler Project • To address the data problem in multiple dimensions
– Big (and small), reliable, secure – Lots of data types: Structured and unstructured – Fast, but not just for large files and sequential access. Need high
transaction rates and random access too. • To support a wide range of applications and interfaces
– Hadoop, but not *just* Hadoop. – Traditional languages, but also R, GIS, DB, and other, perhaps less
scalable things by HPC standards. • To support the full data lifecycle
– More than a scratch (aka purged) file system – Metadata and collection management support
14*
![Page 4: In ~2011, we discovered an exciting new failure mode in ...files.meetup.com/14556262/Austin Data Meetup - TACC... · DSSD Storage • Not SSD ; a custom interface allows access to](https://reader036.fdocuments.us/reader036/viewer/2022071218/604f77e96012b415e00c4f9c/html5/thumbnails/4.jpg)
Wrangler
40 Gb/s Ethernet
100 Gbps Public
Network
GlobusInterconnect with1 TB/s throughput
IB Interconnect 120 Lanes(56 Gb/s) non-blocking
High Speed Storage System500+ TB1 TB/s
250M+ IOPS
Access &Analysis System
96 Nodes128 GB+ Memory
Haswell CPUs
Mass Storage Subsystem10 PB
(Replicated)
Access &Analysis System
24 Nodes128 GB+ Memory
Haswell CPUs
Mass Storage Subsystem10 PB
(Replicated)
TACC Indiana
15*
![Page 5: In ~2011, we discovered an exciting new failure mode in ...files.meetup.com/14556262/Austin Data Meetup - TACC... · DSSD Storage • Not SSD ; a custom interface allows access to](https://reader036.fdocuments.us/reader036/viewer/2022071218/604f77e96012b415e00c4f9c/html5/thumbnails/5.jpg)
16*
Half*of*the*Final*
Cabling*on*4*D5s*
6*of*the*10*D5s*
5*of*the*35*storage*
JBODs*and*Lustre*
Metadata*server*
12*of*120*compute*
Haswell*nodes*
![Page 6: In ~2011, we discovered an exciting new failure mode in ...files.meetup.com/14556262/Austin Data Meetup - TACC... · DSSD Storage • Not SSD ; a custom interface allows access to](https://reader036.fdocuments.us/reader036/viewer/2022071218/604f77e96012b415e00c4f9c/html5/thumbnails/6.jpg)
DSSD Storage • Not SSD ; a custom interface allows
access to the NAND flash technology performance without the overhead of the traditional “disk” interface.
• Not Flash storage tied to individual nodes – Multiple nodes have direct access to the
same storage via independent PCI – ~600 PB of usable space – 10s of GB/s IO to one node – Accessible across whole cluster
![Page 7: In ~2011, we discovered an exciting new failure mode in ...files.meetup.com/14556262/Austin Data Meetup - TACC... · DSSD Storage • Not SSD ; a custom interface allows access to](https://reader036.fdocuments.us/reader036/viewer/2022071218/604f77e96012b415e00c4f9c/html5/thumbnails/7.jpg)
Dynamic Storage Provisioning • The high speed storage will be visible in several ways:
– As a traditional filesystem, for traditional applications – As an HDFS filesystem, for Hadoop, Spark, and other HDFS
applications. – As a SQL database backend with high transaction rates – As an object store with a native API, for novel data applications
• In addition to our “traditional” HPC stack, we support R, databases, NoSQL databases, and the full Hadoop “suite”.
• Long term storage and cataloging leveraging the replicated 10PB file system
18*
![Page 8: In ~2011, we discovered an exciting new failure mode in ...files.meetup.com/14556262/Austin Data Meetup - TACC... · DSSD Storage • Not SSD ; a custom interface allows access to](https://reader036.fdocuments.us/reader036/viewer/2022071218/604f77e96012b415e00c4f9c/html5/thumbnails/8.jpg)
Moving Data “Effectively” • 2x100 GB/s Internet 2 connection… • Data Dock capability for the “last mile
problem” where limited by other networks.
3,000 Rice genomes from the Philippines; yes, this is the actual box
19*