IBM HPC/HPDA/AI Solutions
Albert Valls Badia, IBM Client Technical Architect
IBM Systems Hardware
Albert_Valls@es.ibm.com
June 15th, 2017
New Drivers and Directions – Datacentric
• Data Volumes are Exploding – Especially Unstructured Data
• Data Needs to be Collected, Managed, and Digested
• Deriving Insight and Information from the Data requires:
• A variety of processing steps in a Workflow
• A variety of processing optimizations
• Many Analytics Steps can make use of Large In Memory
Solvers
• Energy Efficiency requires:
• Processing Elements that are Optimized to the task
• Energy and Data aware Workflow Management
• The Open Power Foundation provides innovation
opportunities to a variety of Partners
• Making innovations like Accelerators Consumable is critical
[Figure: Price/Performance from 2000 to 2020 - full system stack innovation required across technology and processors, firmware/OS, accelerators, software, storage, network, and the workflow dependency graph.]
OpenPOWER and Innovation (strategy started in 2014)
[Figure: From a closed, IBM-only stack to open innovation, with research and innovation contributed by OpenPOWER partners such as IBM, NVIDIA, TYAN and Mellanox.]
OpenPOWER: Bringing Partner Innovation to Power Systems
• Grown from 5 initial members to 200+ members in 24 countries
OpenPOWER Innovation Pervasive in System Design (21 TFlops/node)
• IBM: POWER8 CPU with NVLink interface
• NVIDIA: Tesla P100 GPU with NVLink
• Ubuntu by Canonical: launch OS supporting NVLink and the Page Migration Engine
• Wistron: platform co-design
• Mellanox: InfiniBand/Ethernet connectivity in and out of the server
• Samsung: 2.5" SSDs
• HGST: optional NVMe adapters
• Hynix, Samsung, Micron: DDR4 memory
POWER8: Leadership performance - designed for Memory Intensive Workloads
[Figure: POWER8 processor attached to DRAM chips through memory buffers.]
• 12 cores, 96 threads (8 threads per core), 4 cache levels
• Faster cores, bigger caches, consistent speed
• Up to 1/2 TB per socket, up to 230 GB/s sustained memory bandwidth
• 3x higher memory bandwidth, up to 1 TB per socket
• Direct links to accelerators
Differentiated Acceleration - CAPI and NVLink
New ecosystems with CAPI:
• Partners innovate, add value, and gain revenue together with IBM
• Technical and programming ease: virtual addressing, cache coherence
• The accelerator (FPGA or ASIC) is a hardware peer, attached through the CAPP/PSL logic over the coherence bus
NVIDIA Tesla GPU with NVLink:
• POWER8 with NVLink: 80 GB/s peak* (40+40 GB/s) between CPU and GPU
• Faster GPU-GPU communication
• Breaks down barriers between CPU and GPU
• Enables future, innovative system architectures
[Figure: CAPI-attached accelerators (POWER8 with CAPP/PSL over the coherence bus and system memory) alongside NVLink-attached Tesla GPUs with their own graphics memory.]
IBM Power Accelerated Computing Roadmap
2015: IBM POWER8 (CAPI interface over PCIe Gen3); NVIDIA Kepler GPUs (PCIe Gen3); Mellanox ConnectX-4 EDR InfiniBand (PCIe Gen3); system: S822LC "Firestone"
2016: IBM POWER8 with NVLink; NVIDIA Pascal GPUs (NVLink); Mellanox ConnectX-4 EDR InfiniBand (CAPI over PCIe Gen3); system: S822LC for HPC "Minsky"
2017: IBM POWER9 (enhanced CAPI & NVLink); NVIDIA Volta GPUs (enhanced NVLink); Mellanox HDR InfiniBand (enhanced CAPI over PCIe Gen4); system: "Witherspoon"
2020+: IBM POWER10; GPUs, interconnect and system name TBD
FLOPS are not the only KPI in HPC: example workflow in seismic analysis
• Read from storage
• Memory load
• Preprocessing
• Real-time algorithm execution
• Visualization and insight
• Simulation and modeling
Every step in the workflow exercises different hardware capabilities; hence the need for a balanced system design.
IBM Data Centric Computing Strategy: HPC->HPDA
Introducing IBM Spectrum Scale
• Remove data-related bottlenecks with a parallel, scale-out solution
• Enable global collaboration with unified storage and global namespace
• Optimize cost and performance with automated data placement
• Ensure data availability, integrity and security with erasure coding, replication, snapshots, and encryption
Highly scalable high-performance unified storage
for files and objects with integrated analytics
Unified Scale-out Data Lake
• File In/Out, Object In/Out; Analytics on demand.
• High-performance native protocols
• Single Management Plane
• Cluster replication & global namespace
• Enterprise storage features across file, object & HDFS
[Figure: Spectrum Scale serving NFS, SMB, POSIX, Swift/S3 and HDFS from a single pool of SSD, fast disk, slow disk and tape, with native RAID, compression and encryption.]
IBM Spectrum Scale: Parallel Architecture
No Hot Spots
• All NSD servers export to all clients in active-active mode
• Spectrum Scale stripes files across NSD servers and NSDs in units of file-system block-size
• File-system load spread evenly
• Easy to scale file-system capacity and performance while keeping the architecture balanced
The NSD client does real-time parallel I/O to all the NSD servers and storage volumes/NSDs.
[Figure: NSD clients connected to NSD servers and storage over an Ethernet (TCP/IP) or low-latency (InfiniBand) network.]
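To make the striping idea concrete, here is a minimal Python sketch (not GPFS source code) of how consecutive file-system blocks can be assigned round-robin across NSDs so that every server carries an even share of the load; the 4 MiB block size and NSD names are illustrative.

# Conceptual sketch (not the GPFS allocator): wide striping spreads a file's
# blocks round-robin across NSDs so every server sees an even share of I/O.

BLOCK_SIZE = 4 * 1024 * 1024  # assume a 4 MiB file-system block size

def block_placement(file_size: int, nsds: list) -> dict:
    """Map each file-system block index to an NSD, round-robin."""
    placement = {nsd: [] for nsd in nsds}
    num_blocks = (file_size + BLOCK_SIZE - 1) // BLOCK_SIZE
    for block in range(num_blocks):
        placement[nsds[block % len(nsds)]].append(block)
    return placement

if __name__ == "__main__":
    # A 1 GiB file over 8 NSDs: every NSD holds ~32 blocks, so no hot spot.
    layout = block_placement(1024 * 1024 * 1024, [f"nsd{i}" for i in range(8)])
    for nsd, blocks in layout.items():
        print(nsd, len(blocks), "blocks")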
Spectrum Scale Cluster Overview
[Figure: One Spectrum Scale cluster exposing a single file system (/file_systemA) through several node roles.]
• Spectrum Scale clients and application nodes (Oracle, ERP, HPC cluster) on commodity servers (x86_64 or Power) mount /file_systemA over the Spectrum Scale NSD protocol.
• Spectrum Scale file servers and Spectrum Scale Native RAID controllers (IBM Elastic Storage solution) serve data from heterogeneous block storage and JBODs; the servers use disk volumes/LUNs.
• Spectrum Scale protocol nodes re-export the same file system to NFS, SMB and OpenStack Swift clients (NFS exports, SMB shares, HTTP GET/PUT) with clustered failover, up to 16 (SMB) or 32 (NFS) servers.
• File-system load is spread evenly across all servers (no hot spots); data is striped across servers in units of the block size, so there is no single-server bottleneck.
• Access to the same data can be shared over NFS, SMB and Swift/S3.
• Easy to scale while keeping the architecture balanced: capacity and performance can both be added.
Spectrum Scale Architecture Highlights: Scalability
Data scalability
• Capacity: large number of disks/LUNs in a single file system
• Throughput: wide striping, large block size
• Capacity efficient (data in i-node, fragments)
• Multiple nodes write in parallel (even within a single file)
Metadata scalability
• Wide striping of all metadata (inodes, indirect blocks, directories, allocation maps, ...)
• Scalable data structures: segmented allocation map, extensible hashing for directories
• Highly scalable, distributed lock manager: after obtaining a lock token, each node can cache metadata, update it locally, and write back directly
• Fine-grain locking, when necessary: shared inode write locks, byte-range locks, locking directory entries by name (hash)
• A dynamically elected metanode collects inode, indirect-block and directory updates
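A small illustration of the access pattern that byte-range locking enables: several processes write disjoint byte ranges of one shared file in parallel. The mount point is a hypothetical Spectrum Scale path; the same sketch runs on any POSIX file system.

# Minimal sketch: several writers update disjoint byte ranges of one shared
# file, the access pattern Spectrum Scale byte-range locks are built for.
import os
from multiprocessing import Process

PATH = "/gpfs/fs1/shared.dat"   # hypothetical Spectrum Scale mount point
CHUNK = 1 * 1024 * 1024         # 1 MiB region per writer

def write_region(rank: int) -> None:
    fd = os.open(PATH, os.O_WRONLY)
    try:
        # Each rank owns its own offset, so byte-range locks never conflict.
        os.pwrite(fd, bytes([rank % 256]) * CHUNK, rank * CHUNK)
    finally:
        os.close(fd)

if __name__ == "__main__":
    with open(PATH, "wb") as f:          # pre-size the file once
        f.truncate(4 * CHUNK)
    procs = [Process(target=write_region, args=(r,)) for r in range(4)]
    for p in procs: p.start()
    for p in procs: p.join()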
Speed and simplicity: graphical user interface
• Reduce administration overhead
• Graphical user interface for common tasks
• Performance monitoring
• Problem determination
• Easy to adopt: common IBM Storage UI framework
• Integrated into Spectrum Control: storage portfolio visibility, consolidated management of multiple clusters
Spectrum Scale Built-in Tiering (ILM)
Challenge
• Data growth is outpacing budget
• Low-cost archive is another storage silo
• Flash is under-utilized because it isn't shared
• Locally attached disk can't be used with centralized storage
• Migration overhead is preventing storage upgrades
Solution
• Automated data placement spanning the entire storage portfolio, including DAS, with a single namespace
• Policy-driven data placement and data migration
• Share storage, even low-latency flash
• Automatic failover and seamless file-system recovery
• Lower TCO
• Powerful policy engine for Information Lifecycle Management
• Fast metadata scanning and data movement
• Automated data migration based on thresholds; users are not affected by data migration
• Example (see the sketch below): when online storage reaches 90% full, move all files of 1 GB or larger that are 60 days old offline to free up space
[Figure: Automated placement across a System pool (flash), Gold pool (SSD) and Silver pool (NL-SAS): files accessed today and smaller than 1 GB stay on fast pools, small files last accessed more than 30 days ago and files last accessed more than 60 days ago move down, and a Silver pool that passes 60% full is drained back to 20%.]
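The example rule above could be sketched in Python as follows. This is purely illustrative and is not the Spectrum Scale ILM policy language (in a real cluster the rule is written as a policy statement and run by the policy engine); the pool capacity and the migrate step are stubs.

# Illustrative sketch of the slide's example threshold rule in Python.
import os, time

GIB = 1024 ** 3
POOL_CAPACITY = 100 * 1024 ** 4          # stub: 100 TiB online pool

def migrate_to_offline(path: str) -> None:
    print("would migrate", path)          # stub for the real data movement

def old_large_files(root: str, min_size=GIB, min_age_days=60):
    """Yield (path, size) for files >= 1 GiB not accessed for 60+ days."""
    cutoff = time.time() - min_age_days * 86400
    for dirpath, _, names in os.walk(root):
        for name in names:
            p = os.path.join(dirpath, name)
            st = os.stat(p)
            if st.st_size >= min_size and st.st_atime < cutoff:
                yield p, st.st_size

def drain_if_full(root: str, used_bytes: int, high=0.90, low=0.80) -> None:
    """Once the pool passes 90% full, migrate candidates until it drops to 80%."""
    if used_bytes / POOL_CAPACITY < high:
        return
    for path, size in sorted(old_large_files(root), key=lambda c: -c[1]):
        migrate_to_offline(path)
        used_bytes -= size
        if used_bytes / POOL_CAPACITY <= low:
            break

if __name__ == "__main__":
    drain_if_full("/gpfs/online_pool", used_bytes=95 * 1024 ** 4)  # hypothetical mount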
Spectrum Scale HDFS Transparency
Challenge
• Separate storage systems for ingest, analysis and results
• HDFS requires locality-aware storage (namenode)
• Data transfer slows time to results
• Different frameworks and analytics tools use data differently
Solution: HDFS Transparency
• Map/Reduce on shared or shared-nothing storage
• No waiting for data transfer between storage systems; results can be shared immediately
• A single data lake for all applications
• Enterprise data management
• Archive and analysis in place
[Figure: Traditional analytics solution (data ingested from the existing system into a separate analytics system, results exported back) versus in-place analytics (the existing system and the analytics system both work on one Spectrum Scale file system through file, object and HDFS Transparency interfaces).]
Spectrum Scale Compression
• Transparent compression for the HDFS Transparency, Object, NFS, SMB and POSIX interfaces
• Improved storage efficiency
• Typically 2x improvement in storage efficiency
• Improved I/O bandwidth
• Read/write compressed data reduces load on storage
• Improved client side caching
• Caching compressed data increases apparent cache size
• Per file compression
• Use policies
• Compress cold data
– Data not being used/accessed
Spectrum Scale Encryption
• Native Encryption of data at rest
• Files are encrypted before they are stored on disk
• Keys are never written to disk
• No data leakage in case disks are stolen or improperly decommissioned
• Secure deletion
• Ability to destroy arbitrarily large subsets of a file system
• No ‘digital shredding’, no overwriting: secure deletion is a cryptographic operation
• Use a Spectrum Scale policy to encrypt (or exclude) files in a fileset or file system
• Generally < 5% performance impact
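A minimal sketch of the "secure deletion is a cryptographic operation" idea, using the third-party Python cryptography package rather than Spectrum Scale's native mechanism (which keeps keys in an external key server, never on disk): once the key is destroyed, every file encrypted under it becomes unreadable, with no need to overwrite the data itself.

# Sketch of encrypt-at-rest plus cryptographic deletion with the `cryptography` package.
from cryptography.fernet import Fernet

key = Fernet.generate_key()      # in Spectrum Scale the master key lives in a key server
cipher = Fernet(key)

# Write only ciphertext to disk.
with open("result.dat.enc", "wb") as f:
    f.write(cipher.encrypt(b"sensitive simulation output"))

# Normal read path: decrypt with the key.
with open("result.dat.enc", "rb") as f:
    assert cipher.decrypt(f.read()) == b"sensitive simulation output"

# "Secure deletion": destroy the key and every file encrypted under it is
# unreadable, however large the data set is.
del key, cipher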
Performance Feature: Spectrum Scale Local Read-Only Cache (LROC)
Benefits
• Expands the local node file cache (pagepool)
• Leverages fast local storage
• Can reduce load on central storage
• Transparent to applications
• Can use inexpensive local devices
Where to use it
• Protocol nodes
• Virtual machine storage
• Large-memory analytics
Easy to enable
• NSD type localCache
• Define only this node as NSD server
[Figure: Application nodes with local LROC devices in front of the shared storage.]
Performance Feature: Spectrum Scale Highly Available Write Cache (HAWC)
Benefits
• Speeds up small writes
• Used by the IBM Elastic Storage Server
Where to use it
• Workloads such as logs that issue small writes
• Any storage architecture: shared disk or shared nothing (use replication)
• I/O sizes up to 64 KiB
Easy to enable
• Create a system.log pool
• Enable the write cache on the file system
[Figure: Application nodes with flash-backed local storage in front of the shared storage.]
Spectrum Scale Multicluster: cross-cluster sharing
• Cross-mounting file systems between Spectrum Scale clusters
• Separate clusters = separate administration domains
• When connection is established, all nodes are interconnected
– All nodes in both clusters must be within same IP network segment / VLAN
– Channel can be encrypted (openssl)
Synchronous Replication & Stretched Cluster
• Replication is performed synchronously by the node that writes to disk
• Synchronous replication happens within a Spectrum Scale cluster
• The I/O does not return to the application until both copies are written
• Active/Active data access; reads are served from the fastest source
• DR with automatic failover and seamless file-system recovery
• If the replication spans two sites, this is a Spectrum Scale Stretched Cluster
[Figure: An application write synchronously replicated to both sites; reads go to whichever copy is fastest.]
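A conceptual sketch of the write path: the writing node does not return the I/O to the application until both replicas are durable, while reads may be served from whichever copy answers first. The replica paths are hypothetical stand-ins for the two failure groups.

# Toy model of synchronous replication semantics.
import os

REPLICAS = ("/site_a/fs/data.blk", "/site_b/fs/data.blk")   # hypothetical paths

def replicated_write(data: bytes) -> None:
    """Return to the caller only after every copy is written and flushed."""
    for path in REPLICAS:
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())          # both copies durable before we acknowledge

def replicated_read() -> bytes:
    """Reads may be served from whichever replica answers; here, the first reachable one."""
    for path in REPLICAS:
        try:
            with open(path, "rb") as f:
                return f.read()
        except OSError:
            continue
    raise IOError("no replica available")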
Spectrum Scale Active File Management (AFM)
• An asynchronous, cross-cluster, data-sharing utility
• Functions well over unreliable and high-latency networks
• Extends the global namespace between multiple WAN-dispersed locations to share and exchange data asynchronously
• Caches local copies of data distributed to one or more clusters to improve local read and write performance
• As data is written or modified at one location, all other locations see that same data
Spectrum Scale AFM Main Concepts
• Home - Where the information lives. Owner of the data in a cache relationship
• Cache - Fileset in a remote cluster that points to home
• The relationship between a Cache and Home is one to one
• Cache knows about its Home. Home does not know a cache exists
• Data is copied to the cache when requested; data written at the cache is copied back to home as fast as possible
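A toy model of the cache/home relationship, with illustrative directories standing in for the two filesets: reads are satisfied from the cache and fetch from home on a miss, while writes complete locally and are replayed back to home asynchronously, so home never needs to know the cache exists.

# Toy read-through / write-back model of an AFM cache fileset.
import os, queue, shutil, threading

HOME, CACHE = "/home_site/fs", "/cache_site/fs"   # hypothetical mount points
pending = queue.Queue()                            # writes queued for home

def cached_read(name: str) -> bytes:
    local = os.path.join(CACHE, name)
    if not os.path.exists(local):                  # cache miss: pull from home
        shutil.copy(os.path.join(HOME, name), local)
    with open(local, "rb") as f:
        return f.read()

def cached_write(name: str, data: bytes) -> None:
    with open(os.path.join(CACHE, name), "wb") as f:
        f.write(data)                              # the write completes locally...
    pending.put(name)                              # ...and is replayed to home later

def replay_to_home() -> None:                      # background "gateway" loop
    while True:
        name = pending.get()
        shutil.copy(os.path.join(CACHE, name), os.path.join(HOME, name))
        pending.task_done()

threading.Thread(target=replay_to_home, daemon=True).start()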
IBM Elastic Storage Server (ESS) is a Software-Defined Solution
Migrate RAID and disk management to commodity file servers!
[Figure: Before - Spectrum Scale servers in front of custom dedicated disk controllers and JBOD disk enclosures. After - Spectrum Scale RAID running on commodity file servers (connected to clients over FDR InfiniBand or 10/40 GbE) that drive the JBOD disk enclosures directly, handling RAID and disk management in software.]
Spectrum Scale Native RAID is a software implementation of storage RAID technologies within Spectrum Scale.
It requires special licensing and is only approved for pre-certified architectures such as Lenovo GSS and IBM ESS (Elastic Storage Server).
Advantages of Spectrum Scale RAID
• Uses standard, inexpensive disk drives; the erasure code is implemented in Spectrum Scale software
• Data is declustered and distributed across all disk drives with the selected RAID protection: 3-way, 4-way, RAID6 8+2P, RAID6 8+3P
• Faster rebuild times: because data is declustered, more disks take part in a rebuild (approx. 3.5 times faster than RAID-5)
• Minimal impact of rebuilds on system performance: the rebuild is done by many disks, and rebuilds can be deferred when there is sufficient protection
• Better fault tolerance: end-to-end checksums and a much higher mean time to data loss (MTTDL)
RAID algorithm
• Two types of RAID: 3- or 4-way replication, and 8 + 2 or 8 + 3 parity
• 2-fault and 3-fault tolerant codes (RAID-D2, RAID-D3)
[Figure: One strip (GPFS block) plus 2 or 3 replicated strips for 3-way (1+2) and 4-way (1+3) replication; 8 strips (one GPFS block) plus 2 or 3 redundancy strips for 8+2p and 8+3p Reed-Solomon, giving the 2-fault and 3-fault tolerant codes.]
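A toy illustration of the strip layout: one GPFS block is cut into 8 data strips and protected by redundancy strips. For brevity the sketch computes a single XOR parity strip and rebuilds one lost strip from it; the real 8+2p and 8+3p codes are Reed-Solomon codes that tolerate 2 or 3 failures.

# Toy strip layout: split a block into 8 data strips, add one XOR parity strip.
from functools import reduce

def split_into_strips(block: bytes, n_data: int = 8):
    """Cut one file-system block into 8 equally sized data strips."""
    strip_len = len(block) // n_data
    return [block[i * strip_len:(i + 1) * strip_len] for i in range(n_data)]

def xor_parity(strips):
    """One redundancy strip: byte-wise XOR of the data strips (RAID-5 style)."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), strips)

def rebuild_missing(strips, parity: bytes) -> bytes:
    """Recover a single lost strip by XOR-ing the parity with the survivors."""
    survivors = [s for s in strips if s is not None]
    return xor_parity(survivors + [parity])

block = bytes(range(256)) * 32                 # pretend 8 KiB "GPFS block"
strips = split_into_strips(block)
p = xor_parity(strips)
strips[3] = None                               # lose one strip (one disk)
assert rebuild_missing(strips, p) == block[3 * 1024:4 * 1024]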
Rebuild overhead reduction example
Declustered RAID6 example: critical rebuild performance on a GL6 with 8+2p.
During the critical rebuild the impact on the workload was high, but as soon as the array was back to single-parity protection the impact on the customer's workload was under 2%. The critical rebuild took 6 minutes.
The Data Integrity Manager prioritizes tasks: rebuild, rebalance, data scrubbing and proactive correction.
End-to-end checksum
• True end-to-end checksum, from the disk surface to the client's Spectrum Scale interface
• Repairs soft/latent read errors and lost/missing writes
• Checksums are maintained on disk and in memory and are transmitted to/from the client
• The checksum is stored in a 64-byte trailer of each 32-KiB buffer: an 8-byte checksum plus 56 bytes of ID and version information
• A sequence number is used to detect lost or missing writes
[Figure: 8 data strips and up to 3 parity strips, each carried in 32-KiB buffers with a 64-byte trailer and a 1/4- to 2-KiB terminus.]
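A sketch of the buffer-trailer idea: each 32-KiB buffer carries a 64-byte trailer holding a checksum plus ID, version and sequence fields, so the receiver can detect both corruption and lost or out-of-order writes. The field layout and CRC used here are invented for illustration and are not the on-disk GNR format.

# Toy 32-KiB buffer with a 64-byte trailer carrying checksum, block ID and sequence.
import struct, zlib

BUF = 32 * 1024
TRAILER = 64

def seal(data: bytes, block_id: int, seq: int) -> bytes:
    assert len(data) == BUF
    csum = zlib.crc32(data)                       # stand-in for the 8-byte checksum field
    trailer = struct.pack("<QQQ", csum, block_id, seq).ljust(TRAILER, b"\0")
    return data + trailer

def verify(buf: bytes, expected_seq: int) -> bytes:
    data, trailer = buf[:BUF], buf[BUF:]
    csum, block_id, seq = struct.unpack("<QQQ", trailer[:24])
    if csum != zlib.crc32(data):
        raise ValueError("bit rot or torn write detected")
    if seq != expected_seq:
        raise ValueError("lost or missing write detected")
    return data

payload = seal(b"\xab" * BUF, block_id=7, seq=42)
assert verify(payload, expected_seq=42) == b"\xab" * BUF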
IBM Elastic Storage Server family
• GS models use 2.5" JBODs or SSDs. Supported drives: 1.2 TB and 1.8 TB SAS; 400 GB, 800 GB and 1.6 TB SSD.
• GL models use 3.5" JBODs. Supported drives: 4 TB, 6 TB and 8 TB NL-SAS 3.5" HDDs.
• Supported NICs: 10GbE and 40GbE Ethernet, FDR or EDR InfiniBand.
[Figure: GS and GL building blocks - FC 5887 disk enclosures with 24 drive slots each.]
GS models (SSD / SAS)
• Model GS1: 24 SSD; 6 GB/s. Net capacity: 400 GB drives = 6 TB, 800 GB = 13 TB, 1.6 TB = 26 TB.
• Model GS2: 46 SAS + 2 SSD, or 48 SSD drives; 2 GB/s (SAS) / 12 GB/s (SSD). Net capacity: 400 GB = 13 TB, 800 GB = 26 TB, 1.6 TB = 53 TB (SSD); 1.2 TB = 35 TB, 1.6 TB = 53 TB (SAS).
• Model GS4: 94 SAS + 2 SSD, or 96 SSD drives; 5 GB/s (SAS) / 16 GB/s (SSD). Net capacity: 400 GB = 28 TB, 800 GB = 57 TB, 1.6 TB = 115 TB (SSD); 1.2 TB = 78 TB, 1.6 TB = 117 TB (SAS).
• Model GS6: 142 SAS + 2 SSD; 7 GB/s. Net capacity: 1.2 TB = 121 TB, 1.6 TB = 182 TB.
GL models (NL-SAS)
• Model GL2 (Analytics Focused): 2 enclosures, 12U; 116 NL-SAS + 2 SSD; 5 to 8 GB/s. Net capacity: 4 TB = 327 TB, 6 TB = 491 TB, 8 TB = 655 TB.
• Model GL4 (Analytics and Cloud): 4 enclosures, 20U; 232 NL-SAS + 2 SSD; 10 to 16 GB/s. Net capacity: 4 TB = 673 TB, 6 TB = 1 PB, 8 TB = 1.3 PB.
• Model GL6 (PetaScale Storage): 6 enclosures, 28U; 348 NL-SAS + 2 SSD; 10 to 25 GB/s. Net capacity: 4 TB = 1 PB, 6 TB = 1.5 PB, 8 TB = 2 PB.
ESS New Models: Performance and Capacity
The new models are built from 5U84 storage enclosures:
• New! Model GL2S: 2 enclosures, 14U; 166 NL-SAS + 2 SSD; max 1.6 PB raw. Net capacity: 4 TB = 508 TB, 8 TB = 1 PB, 10 TB = 1.27 PB.
• New! Model GL4S: 4 enclosures, 24U; 334 NL-SAS + 2 SSD; max 3.3 PB raw. Net capacity: 4 TB = 1 PB, 8 TB = 2 PB, 10 TB = 2.5 PB.
• New! Model GL6S: 6 enclosures, 34U; 502 NL-SAS + 2 SSD; max 5 PB raw. Net capacity: 4 TB = 1.5 PB, 8 TB = 3.1 PB, 10 TB = 3.9 PB.
For comparison, the existing models:
• Model GL2: 2 enclosures, 12U; 116 NL-SAS + 2 SSD; max 0.9 PB raw.
• Model GL4: 4 enclosures, 20U; 232 NL-SAS + 2 SSD; max 1.8 PB raw.
• Model GL6: 6 enclosures, 28U; 348 NL-SAS + 2 SSD; max 2.8 PB raw.
[Figure: Sequential throughput vs. capacity across the six models (chart values of 8, 11, 17, 23, 25 and 34 GB/s), from GL2 at the low end up to GL6S at roughly 34 GB/s.]
Software Defined Compute: IBM Platform Computing
Delivering a highly utilized shared-services environment optimized for time to results.
Application examples: simulation, analysis, design, big data.
Traditional (IT constrained, repeated for many apps and groups):
• Long wait times
• Low utilization
• IT sprawl
Software defined, with IBM Platform Computing making lots of computers look like "one" and prioritizing the matching of supply with demand across clusters, grid and cloud, for big data / Hadoop, simulation & modeling, analytics and long-running services:
• High utilization, throughput and performance
• Prioritization
• Reduced cost
• Faster results with fewer resources
Overall Artificial Intelligence (AI) Space
• Artificial Intelligence: human intelligence exhibited by machines
• Machine Learning / Deep Learning (cognitive): "human trained" using large amounts of data and the ability to learn how to perform the task; deep learning systems break tasks down using artificial neural networks
New data sources (NoSQL, Hadoop & analytics) enable a new class of applications: machine learning and training, pattern matching, image recognition, real-time decision support, complex workflows, and data lakes. They extend enterprise applications (finance: fraud detection and prevention; retail: shopping advisors; healthcare: diagnostics and treatment; supply chain and logistics) and extend predictive analytics to advanced analytics with AI, growing across compute, middleware, and storage.
PowerAI Platform
• Deep learning frameworks: Caffe, NVCaffe, IBMCaffe, Torch, DL4J, TensorFlow, Theano, OpenBLAS
• Supporting libraries: Bazel, DIGITS, NCCL
• Distributed frameworks: coming soon
• Accelerated servers and infrastructure for scaling: cluster of NVLink servers, Spectrum Scale high-speed parallel file system, scale to cloud
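As a quick sanity check on a PowerAI node, one of the listed frameworks can be asked whether it actually sees the NVLink-attached GPUs; the snippet below assumes a TensorFlow 2.x install (the TensorFlow 1.x releases PowerAI originally shipped with would use tf.test.is_gpu_available() instead).

# List the GPUs visible to TensorFlow on the node.
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
print(f"TensorFlow {tf.__version__} sees {len(gpus)} GPU(s)")
for gpu in gpus:
    print(" ", gpu.name)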
Where to start? IBM Power System S822LC: the Deep Learning server
• 20 POWER8 cores with NVLink
• Up to 1 TB of DDR4 memory
• NVIDIA Tesla P100 GPUs with NVLink (up to 4)
Target workloads and reference users:
• Parallel computing (e.g. Universidad Carlos III, Barcelona Supercomputing Center)
• GPU development and optimisation (e.g. molecular dynamics, Centro de Biología Molecular)
• Machine Learning / Deep Learning
Entry configuration: 20-core POWER8 + 256 GB + 1 NVIDIA Volta GPU, starting at €27,500 + VAT.
Questions?