ARM + DSP Supercomputer Modular HPEC Architectures - nCore … · 2017. 8. 14. · Y-Class AMC Node...

Page 1

ARM + DSP Supercomputer Modular HPEC Architectures

nCore

Page 2

System Overview

The nCore BrownDwarf Y-Class system unifies COTS technologies, high-performance SoCs, advanced low-latency interconnects, and optimized software to create a supercomputer delivering exceptional performance, reliability, power telemetry, reconfigurability, and programmability at significantly reduced power levels.

More information can be found here: http://ncorehpc.com/browndwarf/

The modular architecture lends itself well to the design, development, and deployment of HPEC systems for military and aerospace applications, medical imaging, biomedical and genomic research, oil and gas exploration, physics simulations, and power vs. performance research.

Page 3

Y-Class AMC Block Diagram

• TI 66AK2H12 “Keystone 2”
• 4 x ARM A15 @ 1.4GHz
• 24 x C66 DSP @ 1.2GHz
• 51.2GB/s Total Memory Bandwidth
• 26GB ECC Memory
• 2TB/s Internal Bus
• 100Gb/s Hyperlink
• 20Gb/s SRIO Compute Fabric
• 10Gb Ethernet System Fabric
• 3 x 1Gb Ethernet OBM Fabric

[Block diagram: Y-Class AMC node]
• Cores: 4 x ARM A15 (4M L2); 24 x C66 DSP, each labeled 1M/32/32
• MSMC blocks with shared cache: 4M, 6M, 4M
• Memory: 3 x 8GB DDR3-1600 and 1 x 2GB DDR3, 12.8GB/s per interface
• Internal bus: 2TB/s; Hyperlink: 50Gb/s per link
• Compute Fabric - MPI: 20Gb SRIO (4 x 5Gbaud)
• System Fabric: 10GbE (10Gb/s)
• OBM Fabric: 1GbE (1Gb/s)
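The memory figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes that each DDR3-1600 interface moves 8 data bytes per transfer (a 64-bit data path, ECC bits not counted) and that the 2GB bank runs at the same rate as the 8GB banks. Both are assumptions rather than statements from the slide, and the constant and function names are illustrative only.

```python
# Bank sizes as labeled on the block diagram: three 8GB banks plus one 2GB bank.
DDR3_BANKS_GB = [8, 8, 8, 2]

MEGATRANSFERS_PER_S = 1600   # DDR3-1600
DATA_BYTES_PER_TRANSFER = 8  # assumption: 64-bit data path, ECC bits excluded


def bank_bandwidth_mb_s() -> int:
    """Peak bandwidth of one DDR3-1600 interface, in MB/s."""
    return MEGATRANSFERS_PER_S * DATA_BYTES_PER_TRANSFER


def node_memory_gb() -> int:
    """Total memory on one AMC node, in GB."""
    return sum(DDR3_BANKS_GB)


def node_bandwidth_mb_s() -> int:
    """Aggregate peak memory bandwidth of one AMC node, in MB/s."""
    # Assumption: all four interfaces run at the DDR3-1600 rate.
    return len(DDR3_BANKS_GB) * bank_bandwidth_mb_s()


if __name__ == "__main__":
    print(node_memory_gb(), "GB ECC memory per node")
    print(bank_bandwidth_mb_s() / 1000, "GB/s per interface")
    print(node_bandwidth_mb_s() / 1000, "GB/s total")
```

Under those assumptions the results agree with the 26GB ECC Memory and 51.2GB/s Total Memory Bandwidth figures in the bullet list.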

Page 4

BrownDwarf Y-Class System Cabinet

[Diagram: Y-Class AMC Node x 4 + Carrier Blade x 12 = BrownDwarf Y-Class system cabinet, shown with a Switch Blade. Carrier Blade: 2TFLOPS SP / 2TOPS Integer, 104GB ECC Memory.]

Applications can use a single AMC node or scale to hundreds of nodes while interfacing with any ATCA or uTCA component

Page 5

Military/Aerospace/Data Acquisition Architecture

[System diagram: Military/Aerospace/Data Acquisition configuration]
• Sensors: Video, Radar, sFPDP, 1553, etc.
• Front end: 8 x A/D + FPGA
• Compute: 6 x BrownDwarf Y-Class blades, SRIO Switch (120Gbps SRIO), 10GbE Switch
• Storage: Storage Blade with disks (6TB) and SSD on a SAS backplane
• Link legend: SRIO - 80Gbps, 10GbE, 1GbE
• System totals: 11 TFLOPS, 576 DSP Cores, 624GB ECC, 1.2TB/s MBW, 380Gbps SRIO
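The system totals for the six-blade configuration can be cross-checked against the per-node figures from the Y-Class AMC slide. The sketch below assumes four AMC nodes per carrier blade, which is inferred from the 104GB blade figure divided by the 26GB node figure. The `Totals` class and `scale` function are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass

NODES_PER_BLADE = 4  # assumption: 104GB per blade / 26GB per node


@dataclass(frozen=True)
class Totals:
    dsp_cores: int
    arm_cores: int
    ecc_memory_gb: int
    memory_bandwidth_mb_s: int


# Per-node figures taken from the Y-Class AMC slide.
NODE = Totals(dsp_cores=24, arm_cores=4, ecc_memory_gb=26,
              memory_bandwidth_mb_s=51_200)


def scale(blades: int, nodes_per_blade: int = NODES_PER_BLADE) -> Totals:
    """Multiply the per-node figures up to a system of `blades` carrier blades."""
    n = blades * nodes_per_blade
    return Totals(
        dsp_cores=NODE.dsp_cores * n,
        arm_cores=NODE.arm_cores * n,
        ecc_memory_gb=NODE.ecc_memory_gb * n,
        memory_bandwidth_mb_s=NODE.memory_bandwidth_mb_s * n,
    )


if __name__ == "__main__":
    system = scale(blades=6)
    print(system.dsp_cores, "DSP cores")
    print(system.ecc_memory_gb, "GB ECC")
    print(system.memory_bandwidth_mb_s / 1_000_000, "TB/s memory bandwidth")
```

For six blades this gives 576 DSP cores, 624GB of ECC memory, and roughly 1.2TB/s of aggregate memory bandwidth, in line with the totals quoted on the slide.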

Page 6

DPI and Big Data Architecture

[System diagram: DPI and Big Data configuration]
• Network: 10GbE span/inline/tap, n x 10GbE, 40GbE Switch
• Packet processing: 2 x 40GbE Packet Processor Blade, labeled 2 x Cavium Octeon II CN6880
• Compute: 6 x BrownDwarf Y-Class blades, SRIO Switch (120Gb SRIO)
• Storage: 2 x Storage Blade with disks (6TB) and SSD on a SAS backplane
• Link legend: SRIO - 80Gbps, 10GbE, 1GbE
• System totals: 11 TFLOPS, 624GB ECC, 1.2TB/s MBW, 480Gbps SRIO

Page 7

H.264 Transcoding Architecture - 6 Slot Variant

H.264 BP

                         8 x C66 DSP Cores  1 x BrownDwarf Blade  3 x BrownDwarf Blade
Resolution             Encoding   Decoding   Encoding   Decoding   Encoding   Decoding
CIF/30                       48        104        576       1248       1728       3744
D1/30                        12         24        144        288        432        864
720p30                        4          8         48         96        144        288
720p60 or 1080p30             2          4         24         48         72        144
1080p60                       1          2         12         24         36         72

H.264 HP

                         8 x C66 DSP Cores  1 x BrownDwarf Blade  3 x BrownDwarf Blade
Resolution             Encoding   Decoding   Encoding   Decoding   Encoding   Decoding
D1/30                         4          8         48         96        144        288
720p60 or 1080p30             1          2         12         24         36         72
1080p60                     0.5          1          6         12         18         36

[System diagram: 6-slot transcoding configuration]
• Network: n x 10GbE, 10/40GbE Switch
• Compute: 3 x BrownDwarf Y-Class blades, SRIO Switch (120Gb SRIO)
• Storage: Storage Blade with disks (6TB) and SSD on a SAS backplane
• Link legend: SRIO - 80Gbps, 10GbE, 1GbE
• System totals: 4.6 TFLOPS, 312GB ECC, 1.2TB/s MBW, 240Gbps SRIO

Page 8

H.265 Transcoding Architecture - 6 Slot Variant

H.265 (Standard Quality)

                         8 x C66 DSP Cores  1 x BrownDwarf Blade  3 x BrownDwarf Blade
Resolution             Encoding   Decoding   Encoding   Decoding   Encoding   Decoding
720p30                        1       0.25         12         48         36        144
1080p30                       2        0.5          6         24         18         72
1080p60                       4          1          3         12          9         36

H.265 (High Quality)

                         8 x C66 DSP Cores  1 x BrownDwarf Blade  3 x BrownDwarf Blade
Resolution             Encoding   Decoding   Encoding   Decoding   Encoding   Decoding
720p30                        2       0.25          6         48         18        144
1080p30                       4        0.5          3         24          9         72
4kp30                        16          2       0.75          6       2.25         18
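The blade-level columns in the transcoding tables appear to follow from the 8-core columns by simple scaling. The sketch below checks that reading under two assumptions that the slides do not state: a blade holds 12 groups of eight C66 cores (96 DSP cores), and the 8-core columns mean channels per group in the H.264 tables but groups per channel in the H.265 tables. The function names are illustrative, and only two of the four tables are reproduced to keep the example short.

```python
from fractions import Fraction

GROUPS_PER_BLADE = 12  # assumption: 96 C66 cores per blade / 8 cores per group


def h264_rule(per_group: str, blades: int) -> Fraction:
    """H.264 reading: the 8-core value is channels per 8-core group."""
    return Fraction(per_group) * GROUPS_PER_BLADE * blades


def h265_rule(groups_per_channel: str, blades: int) -> Fraction:
    """H.265 reading: the 8-core value is 8-core groups needed per channel."""
    return Fraction(GROUPS_PER_BLADE * blades) / Fraction(groups_per_channel)


# Rows copied from the tables, in column order:
# (8-core enc, 8-core dec, 1-blade enc, 1-blade dec, 3-blade enc, 3-blade dec)
H264_BP = {
    "CIF/30": ("48", "104", "576", "1248", "1728", "3744"),
    "D1/30": ("12", "24", "144", "288", "432", "864"),
    "720p30": ("4", "8", "48", "96", "144", "288"),
    "720p60 or 1080p30": ("2", "4", "24", "48", "72", "144"),
    "1080p60": ("1", "2", "12", "24", "36", "72"),
}
H265_HQ = {
    "720p30": ("2", "0.25", "6", "48", "18", "144"),
    "1080p30": ("4", "0.5", "3", "24", "9", "72"),
    "4kp30": ("16", "2", "0.75", "6", "2.25", "18"),
}


def consistent(table, rule) -> bool:
    """True if every blade column equals the rule applied to the 8-core column."""
    for enc, dec, enc1, dec1, enc3, dec3 in table.values():
        expected = (rule(enc, 1), rule(dec, 1), rule(enc, 3), rule(dec, 3))
        actual = tuple(Fraction(v) for v in (enc1, dec1, enc3, dec3))
        if expected != actual:
            return False
    return True


if __name__ == "__main__":
    print("H.264 BP consistent:", consistent(H264_BP, h264_rule))
    print("H.265 HQ consistent:", consistent(H265_HQ, h265_rule))
```

Both checks pass for the rows shown, which supports the reading but does not confirm it; the slides themselves do not label the units.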

[System diagram: 6-slot transcoding configuration]
• Network: n x 10GbE, 10/40GbE Switch
• Compute: 3 x BrownDwarf Y-Class blades, SRIO Switch (120Gb SRIO)
• Storage: Storage Blade with disks (6TB) and SSD on a SAS backplane
• Link legend: SRIO - 80Gbps, 10GbE, 1GbE
• System totals: 4.6 TFLOPS, 312GB ECC, 1.2TB/s MBW, 240Gbps SRIO

Page 9

High Performance Computing Architecture

[Rack diagram: High Performance Computing configuration]
• Shelves of BrownDwarf Y-Class blades on backplanes, with SRIO/1GbE Switches
• 120Gbps SRIO links and an SRIO Rack Switch
• 2 x Storage Blade
• Link legend: SRIO - 80Gbps, 1GbE
• Power: 9.7kW / 110V
• System totals: 65 TFLOPS, 3.3TB ECC
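The rack-level figures allow a rough efficiency estimate. The sketch below assumes that the 9.7kW figure is the power draw of the whole rack at the quoted 65 TFLOPS peak, and that 3.3TB means 3300GB; neither is stated on the slide. The function names are illustrative.

```python
def gflops_per_watt(peak_tflops: float, power_kw: float) -> float:
    """Peak GFLOPS per watt, from a TFLOPS figure and a kW figure."""
    return (peak_tflops * 1000.0) / (power_kw * 1000.0)


def estimated_blades(total_ecc_gb: float, blade_ecc_gb: float = 104.0) -> int:
    """Estimate the blade count from total ECC memory (104GB per blade)."""
    return round(total_ecc_gb / blade_ecc_gb)


if __name__ == "__main__":
    # Assumption: 9.7kW covers the full 65 TFLOPS rack.
    print(f"{gflops_per_watt(65, 9.7):.2f} GFLOPS/W")
    # Assumption: 3.3TB is taken as 3300GB.
    print(estimated_blades(3300), "blades (estimate)")
```

Under these assumptions the rack works out to about 6.7 peak GFLOPS per watt, and the memory total suggests roughly 32 carrier blades.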

Page 10

nCore Lithium Suite

• nCore Lithium Suite is the fastest way to performance and productivity on TI Keystone II and BrownDwarf
• Ubuntu ARM HPC-centric server distribution enables access to 6.5k Linux packages
• Native development environment on Keystone II for ARM & DSP using optimizing compilers
• Offload computations to C66x DSP cores using OpenMP 4.0 with accelerator model and OpenCL
• OpenMPI over SRIO, optimized IPP replacement library for C66x, advanced DMA library for C66x
• Performance Optimization Tool Layers (PAPI for A15 and C66x DSP), BLAS (ATLAS)
• Industry-leading commercial support

Currently Supported Platforms:
- nCore BrownDwarf YCNODE
- nCore BrownDwarf MBLADE
- TI’s XTCIEVMK2X EVM
- Others to follow

Li-HPC

nCore is the worldwide leader in TI Keystone software technologies

[Software stack diagram]
• Software layers: Application; Linux OpenMP/OpenMPI; Accelerated Code; Code Dispatch; OpenMP Acc Model/OpenCL Runtime; Numerical Libraries
• ARM: 4 x Cortex A15
• DSP: 8 x C66x DSP, each with L1 and L2
• MSMC + Shared Memory; TeraNet (x2)
• SRIO: 20Gbs, 4 x 5Gbs Gen 2.1 RapidIO
• Memory: 2 x DDR3 72-Bit (2GB, 8GB)