Xilinx Alveo Portfolio Expansion… · Announcing Xilinx Alveo U50 - Industry’s First Adaptable...

© Copyright 2019 Xilinx · Xilinx Alveo Portfolio Expansion · Alveo U50 Launch · UNDER EMBARGO UNTIL TUESDAY, AUGUST 6, 2019 at 6 A.M. PT/9 A.M. ET

Page 1: Xilinx Alveo Portfolio Expansion… · Announcing Xilinx Alveo U50 - Industry’s First Adaptable Compute, Networking, Storage Accelerator built for Any Server, Any Cloud Broad

Xilinx Alveo Portfolio Expansion

Alveo U50 Launch

UNDER EMBARGO UNTIL TUESDAY, AUGUST 6, 2019 at 6 A.M. PT/9 A.M. ET

Page 2

Expanding the Alveo Platform

• Announcing Xilinx Alveo U50 - Industry’s First Adaptable Compute, Networking, Storage Accelerator built for Any Server, Any Cloud
• Broad and growing Alveo ecosystem of software partners and continued enhancement of developer tools to scale up Alveo solutions
• Dramatic improvements in throughput, latency and power efficiency across a range of critical data center applications

Page 3

ACCESSIBLE
• Deploy in the cloud or on-premises
• Rich set of accelerated applications

FAST
• Built for high throughput, ultra-low latency
• Accelerate compute, networking, storage

ADAPTABLE
• Deploy optimized domain-specific architectures
• Adapt to changing algorithms

Page 4

Alveo U50: Industry’s First Adaptable Accelerator Built for Any Server, Any Cloud

• Over 10X improvement in performance and power efficiency for critical data center applications
• Most advanced, adaptable platform for accelerating compute, storage, and networking
• Ultra-efficient power envelope and form factor make it flexible enough for any server, any cloud

Page 5

Xilinx Alveo U50 Key Specifications

UltraScale+ Architecture

Low-profile form factor

8GB HBM2 Memory, 460GB/sec

QSFP28 (100GbE)

PCIe Gen4, CCIX, PCIe Gen3

< 75W

Page 6

Accelerating the Data Center

Compute, Storage, Networking

Page 7

Domain Specific Architectures

AlexNet, GoogLeNet, DenseNet

Highest throughput, lowest latency, and best efficiency require different HW architectures

Page 8

Optimized Performance Requires Custom Memory and Datapath

1) Custom Data Path
2) Custom Precision
3) Custom Memory Hierarchy

Diagram labels: Off-Chip DDR, On-Chip Memory

Page 9

ALVEO Solution Stack

CLOUD, ON-PREMISE

Solution Providers, App & IP Developers, Channel Partners, End Customers

Tencent Cloud

Data Analytics, Video & Image Processing, Machine Learning, Financial Computing, Life Science & HPC

Page 10

Xilinx Powered Hyperscale Cloud Data Centers

Growing Cloud Availability

Page 11

Growing Ecosystem

Life Sciences & HPC, Video Processing, Data Analytics

Machine Learning, Financial Computing, Image Processing

Page 12

Growth Since October 2018 Launch

Life Sciences & HPC, Video Processing, Data Analytics

Machine Learning, Financial Computing, Image Processing

Published Applications: 2x

Developers Trained: 4x

Page 13

Deployment Stack for Scale

CLOUD ALVEO INSTANCES, ON-PREM ALVEO SERVERS

XILINX ALVEO

ALVEO CONTAINERIZED APPLICATIONS

KUBERNETES ORCHESTRATION

Page 14

Speech Translation: High Throughput and Low Latency Inference Acceleration

High Throughput & Low-Latency Inference Performance Unachievable by CPUs & GPUs

Chart: GPU 1x, Alveo U50 10x

Estimated data: Alveo U50 (B=2, L=8), Tesla T4 (B=8, L=8)

https://medium.com/syncedreview/tsinghua-university-publishes-comprehensive-machine-translation-reading-list-c3f2df594218

Page 15

Database Analytics: High Throughput Query Acceleration

Higher query throughput and simplified infrastructure

Chart: CPU In-Memory 1x, 1x Alveo U50 4x, 2x Alveo U50 9x, 3x Alveo U50 13x

Intel® Xeon® Platinum 8260 Processor (35.75M Cache, 2.40 GHz), 24 cores. CPU query time = 210ms, 34k query/hr. Alveo U50 = 24ms, 150k query/hr
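The footnote figures can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted on the slide; the helper `serial_queries_per_hour` and its one-query-at-a-time model are assumptions made for illustration, not part of the benchmark description.

```python
# Figures as quoted in the slide footnote (vendor estimates, not re-measured).
CPU_QUERY_MS = 210
CPU_QUERIES_PER_HR = 34_000
U50_QUERY_MS = 24
U50_QUERIES_PER_HR = 150_000

MS_PER_HOUR = 60 * 60 * 1000


def serial_queries_per_hour(query_ms: float) -> float:
    """Queries per hour if queries run strictly one after another (assumption)."""
    return MS_PER_HOUR / query_ms


# Ratio of per-query latencies and of the quoted hourly throughputs.
latency_ratio = CPU_QUERY_MS / U50_QUERY_MS
throughput_ratio = U50_QUERIES_PER_HR / CPU_QUERIES_PER_HR

# Hourly rates implied by the latencies alone under the serial model.
u50_serial_rate = serial_queries_per_hour(U50_QUERY_MS)
cpu_serial_rate = serial_queries_per_hour(CPU_QUERY_MS)

print(f"latency ratio:        {latency_ratio:.2f}x")
print(f"throughput ratio:     {throughput_ratio:.2f}x")
print(f"U50 serial queries/h: {u50_serial_rate:.0f}")
print(f"CPU serial queries/h: {cpu_serial_rate:.0f}")
```

The quoted throughputs give a ratio of about 4.4, in line with the 4x bar for a single Alveo U50. The U50 latency of 24ms reproduces the quoted 150k query/hr exactly under the serial model, while the quoted CPU rate of 34k is roughly twice what 210ms implies serially; that would be consistent with two queries in flight at once, but the slide does not say.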

Page 16

Financial Market Modeling: Hyper Efficient Derivative Pricing and Risk Modeling

Faster, more efficient time to insight at a fraction of the cost

Chart: CPU 1x, GPU 3x, Alveo U50 20x

Intel Xeon E5-2697 v4, GCC 5.4.0; Nvidia Tesla V100 16GB PCIe, CUDA 10.1 / GCC 5.4.0; Xilinx Alveo U50, SDAccel 2018.3

Page 17

Electronic Trade Operations: Ultra Low-Latency Networking and Compute Acceleration

Under 500ns latency with deterministic performance

Chart: CPU 1x, U50 20x

T2T (tick-to-trade) latency is <0.5 µs, measured from start of packet in on the tick (market data) to start of packet out on the order.

Source: Xilinx Analysis

Page 18

Computational Storage: Line-rate Data Compression Acceleration

Compression, decompression, erasure coding, encryption all accelerated on one platform

Chart: CPU 1x, Alveo U50 20x

Intel Skylake-SP 6152 @2.10GHz 22-core CPU (Ubuntu 16.04), GB/s compression per CPU core = 0.0229. Alveo U50 = 10GB/s

Source: Xilinx Analysis
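The 20x figure follows from the footnote if per-core compression throughput is assumed to scale linearly across all 22 cores; that scaling model is an assumption here, as the slide gives only the per-core rate. A minimal sketch of the arithmetic:

```python
import math

# Figures from the slide footnote.
CORES_PER_CPU = 22
GBPS_PER_CORE = 0.0229   # GB/s of compression per CPU core
U50_GBPS = 10.0          # GB/s of compression on one Alveo U50

# Assumption: throughput scales linearly with core count.
cpu_gbps = CORES_PER_CPU * GBPS_PER_CORE
speedup_vs_cpu = U50_GBPS / cpu_gbps

# Number of such cores needed to match one card under the same assumption.
cores_to_match = math.ceil(U50_GBPS / GBPS_PER_CORE)

print(f"one 22-core CPU: {cpu_gbps:.4f} GB/s")
print(f"U50 vs one CPU:  {speedup_vs_cpu:.1f}x")
print(f"cores to match:  {cores_to_match}")
```

Under these assumptions one CPU compresses at roughly 0.5 GB/s, so a single card at 10 GB/s comes out just under 20x.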

Page 19

Hadoop Acceleration: Line-rate Data Compression Acceleration

Alveo U50 Acceleration
• 2x Fewer Nodes
• 40% Lower Total Cost
• 20x Throughput Per Node

2x Dual CPU Servers: 192TB, 1GB/sec per node compression throughput

Alveo Server with 2x Alveo U50: 96TB (192TB effective), 20GB/sec per node compression throughput

Intel Skylake-SP 6152 @2.10GHz CPU (Ubuntu 16.04), GB/s compression per CPU core = 0.0229. Alveo U50 = 10GB/s. Assume 2:1 compression
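The per-node numbers on this slide can be reproduced from the footnote. In the sketch below, the 22 cores per CPU is taken from the compression benchmark footnote for the same Intel Skylake-SP 6152, and linear scaling across cores and cards is an assumption; the constant names are illustrative only.

```python
# Inputs from the slide and its footnote.
GBPS_PER_CORE = 0.0229     # GB/s compression per CPU core
CORES_PER_CPU = 22         # assumption: same 22-core part as the compression benchmark
CPUS_PER_SERVER = 2        # "dual CPU" server
U50_GBPS = 10.0            # GB/s compression per Alveo U50
U50_PER_SERVER = 2         # "Alveo Server with 2x Alveo U50"
ALVEO_RAW_TB = 96.0        # raw capacity of the Alveo server
COMPRESSION_RATIO = 2.0    # "Assume 2:1 compression"

# Per-node compression throughput, assuming linear scaling.
cpu_node_gbps = CPUS_PER_SERVER * CORES_PER_CPU * GBPS_PER_CORE
alveo_node_gbps = U50_PER_SERVER * U50_GBPS
throughput_ratio = alveo_node_gbps / cpu_node_gbps

# Effective capacity once data is stored compressed.
effective_tb = ALVEO_RAW_TB * COMPRESSION_RATIO

print(f"dual-CPU node: {cpu_node_gbps:.2f} GB/s")
print(f"Alveo node:    {alveo_node_gbps:.0f} GB/s")
print(f"ratio:         {throughput_ratio:.1f}x")
print(f"effective TB:  {effective_tb:.0f}")
```

This lines up with the slide: about 1 GB/s for a dual-CPU node against 20 GB/s for the Alveo node, and 96TB raw becoming 192TB effective at 2:1.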

Page 20

Key Takeaways

Expanded Market Opportunity
• First adaptable accelerator for compute, networking & storage - built for any server, any cloud

Performance & TCO Advantages
• 10-20x improvements in throughput, latency and power efficiency
• First PCIe Gen 4 support with HBM2 & 100Gbps network ports

Simplified Programming & Deployment
• Growing ecosystem of software partners & accelerated solutions
• Enhanced developer tools & deployment stacks for scale

Page 21

APPENDIX

Page 22

Xilinx Alveo Product Lineup

U280: UltraScale+ Architecture; 1,304k LUTs; dual slot, full height; 8GB HBM2, 460GB/sec; 2x QSFP28 (100GbE); PCIe Gen3, Gen4, CCIX; < 225W

U250: UltraScale+ Architecture; 1,728k LUTs; dual slot, full height; 64GB DDR, 77GB/sec; 2x QSFP28 (100GbE); PCIe Gen3; < 225W

U200: UltraScale+ Architecture; 1,182k LUTs; dual slot, full height; 64GB DDR, 77GB/sec; 2x QSFP28 (100GbE); PCIe Gen3; < 225W

U50: UltraScale+ Architecture; 872k LUTs; single slot, half height; 8GB HBM2, 460GB/sec; 1x QSFP28 (100GbE); PCIe Gen3, Gen4, CCIX; < 75W

Page 23

One Platform. Broadest Acceleration

CPU (Sequential)

GPU (Parallel)

Alveo (Sequential + Parallel)

High Level Coding

Complex Memory & Datapath

Adaptable Hardware

AI Inference + Pre/Post Process

On-board Networking

3rd Party Applications

Page 24

Live Video Transcoding

Simplified and Lower Cost Infrastructure

Alveo U50 HEVC Video Compression
• 20x Throughput Per Node
• 23x Lower Power Cost
• 8x Lower HW Cost

20x Dual CPU Servers: 40x Xeon Gold, H.265 very-high quality, 20x 1080p30

One Alveo U50 Server: 5x Alveo U50, NGCodec HEVC Very-High Quality, 20x 1080p30

Page 25

Fabric Attached Computational Storage

Bringing acceleration to NVMeoF

Technology Preview: Fabric Attached Computational Storage Array

˃ Fabric connected accelerator fronts SSDs – brings the compute to the data.

˃ NVMeoF target offloaded to U50 supporting 2.5 Million IOPS.

˃ Only 1 microsecond added latency to programmable inline storage accelerators.

Diagram labels: Ethernet, RNIC target IP (RoCEv2), NVMf IP, inline accelerators, PCIe interface, PCIe switches, NVMe SSDs, BMC, management (OpenStack API)

Inline Accelerator Examples:

Storage services:
• (De)Compression
• (De)Encryption
• Data protection

Database Acceleration:
• Scan
• Filter
• Aggregate

Page 26

Alveo Lineup – Detailed Specs

Product Name | Alveo U200 | Alveo U250 | Alveo U280 | Alveo U50 (NEW)

Dimensions
Width | Dual Slot | Dual Slot | Dual Slot | Single Slot
Form Factor, Passive | Full Height, ¾ Length | Full Height, ¾ Length | Full Height, ¾ Length | Half Height, ½ Length
Form Factor, Active | Full Height, Full Length | Full Height, Full Length | Full Height, Full Length | –

Logic
Look-Up Tables | 1,182K | 1,728K | 1,304K | 872K
Registers | 2,364K | 3,456K | 2,607K | 1,743K
DSP Slices | 6,840 | 12,288 | 9,024 | 5,952

DRAM Memory
DDR Format | 4x 16GB 72b DIMM DDR4 | 4x 16GB 72b DIMM DDR4 | 2x 16GB 72b DIMM DDR4 | –
DDR Total Capacity | 64GB | 64GB | 32GB | –
DDR Max Data Rate | 2400MT/s | 2400MT/s | 2400MT/s | –
DDR Total Bandwidth | 77GB/s | 77GB/s | 38GB/s | –
HBM2 Total Capacity | – | – | 8GB | 8GB
HBM2 Total Bandwidth | – | – | 460GB/s | 460GB/s

Internal SRAM
Total Capacity | 43MB | 57MB | 43MB | 28MB
Total Bandwidth | 37TB/s | 47TB/s | 35TB/s | 24TB/s

Interfaces
PCI Express® | Gen3 x16 | Gen3 x16 | Gen3 x16, 2x Gen4 x8, CCIX | Gen3 x16, 2x Gen4 x8, CCIX
Network Interface | 2x QSFP28 | 2x QSFP28 | 2x QSFP28 | U50: 1x QSFP28; U50DD: 2x SFP-DD

Power and Thermal
Thermal Cooling | Passive, Active | Passive, Active | Passive, Active | Passive
Typical Power | 100W | 110W | 100W | 50W
Maximum Power | 225W | 225W | 225W | 75W

Time Stamp
Clock Precision | – | – | – | IEEE Std 1588
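The DDR bandwidth rows can be derived from the DIMM count and data rate in the same table. The sketch below assumes each DIMM sits on its own channel and that 64 of the 72 bits carry data (the remaining 8 being ECC); neither detail is stated on the slide, and the function name is illustrative.

```python
def ddr_bandwidth_gbs(dimms: int, mega_transfers_per_s: float, data_bits: int = 64) -> float:
    """Peak DDR bandwidth in GB/s: DIMMs x transfers/s x bytes per transfer.

    Assumes one independent channel per DIMM and 64 data bits per 72-bit DIMM.
    """
    bytes_per_transfer = data_bits / 8
    return dimms * mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


# DIMM counts and data rate as listed in the table.
u200_u250_gbs = ddr_bandwidth_gbs(dimms=4, mega_transfers_per_s=2400)
u280_gbs = ddr_bandwidth_gbs(dimms=2, mega_transfers_per_s=2400)

# HBM2 bandwidth from the table, compared with the four-DIMM DDR figure.
HBM2_GBS = 460.0
hbm2_vs_ddr = HBM2_GBS / u200_u250_gbs

print(f"U200/U250 DDR: {u200_u250_gbs:.1f} GB/s")
print(f"U280 DDR:      {u280_gbs:.1f} GB/s")
print(f"HBM2 vs 4-DIMM DDR: {hbm2_vs_ddr:.2f}x")
```

The results, 76.8 GB/s and 38.4 GB/s, round to the 77GB/s and 38GB/s shown in the table, and the 8GB of HBM2 on the U280 and U50 delivers roughly six times the bandwidth of the four-DIMM DDR configuration.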