Past, Present & Future - GTC On-Demand Featured Talks...

S7204 Past, Present & Future: AI & HPC Infrastructure in Azure @Karan_Batta Senior Program Manager Microsoft Azure Compute


Page 1: Past, Present & Future - GTC On-Demand Featured Talks ...on-demand.gputechconf.com/gtc/2017/presentation/s7204-karan-batta... · Past, Present & Future: AI & HPC Infrastructure in

S7204

Past, Present & Future: AI & HPC Infrastructure in Azure

@Karan_Batta

Senior Program Manager

Microsoft Azure Compute

Page 2:

Our Mission

“No compromise infrastructure”

Invest in scale-out; hyper-scale workloads need low-latency, high-bandwidth networking

Close to bare-metal performance

Invest in an ecosystem of partners

True “HPC in the cloud”

Page 3:

Recap

Page 4:

Compute Virtual Machines (NC)

Size    | NC6                           | NC12                         | NC24                          | NC24r
Cores   | 6                             | 12                           | 24                            | 24
GPU     | 1 K80 GPU (1/2 physical card) | 2 K80 GPUs (1 physical card) | 4 K80 GPUs (2 physical cards) | 4 K80 GPUs (2 physical cards)
Memory  | 56 GB                         | 112 GB                       | 224 GB                        | 224 GB
Disk    | ~380 GB SSD                   | ~680 GB SSD                  | ~1.5 TB SSD                   | ~1.5 TB SSD
Network | Azure Network                 | Azure Network                | Azure Network                 | InfiniBand
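
The NC sizes above can be treated as a small lookup table when choosing a VM for a job. The following is a minimal sketch, assuming the core, GPU, memory, and network values from the slide; the `VmSize` class and the `pick_size` helper are hypothetical names introduced for illustration and are not part of any Azure SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VmSize:
    name: str
    cores: int
    gpus: int          # logical K80 GPUs (one physical K80 card carries two)
    memory_gb: int
    infiniband: bool   # True only for the "r" size in the table


# Values transcribed from the NC table on the slide, smallest size first.
NC_SIZES = [
    VmSize("NC6", 6, 1, 56, False),
    VmSize("NC12", 12, 2, 112, False),
    VmSize("NC24", 24, 4, 224, False),
    VmSize("NC24r", 24, 4, 224, True),
]


def pick_size(min_gpus: int,
              min_memory_gb: int = 0,
              need_infiniband: bool = False) -> Optional[VmSize]:
    """Return the first (smallest) NC size meeting the requirements, or None."""
    for size in NC_SIZES:
        if size.gpus < min_gpus or size.memory_gb < min_memory_gb:
            continue
        if need_infiniband and not size.infiniband:
            continue
        return size
    return None


if __name__ == "__main__":
    print(pick_size(min_gpus=2))
    print(pick_size(min_gpus=1, need_infiniband=True))
```

Because the list is ordered from smallest to largest, a multi-node job that needs InfiniBand always resolves to NC24r, whatever its GPU count, since that is the only size in the table with that network.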

Page 5:

State of the Union

5000+ customer signups during preview

General Availability since December 1st

Huge demand for specialized hardware

GPU offerings at the forefront of hardware innovation

New 1st party products built on N-Series like Cris.ai

100s of external customers in production

Areas such as AI & Deep Learning driving growth

Page 6:
Page 7:

Under The Covers

Applications

GPU Provisioning

Host OS

Client OS

Hardware

• Azure Developer & Platform Services

• Custom Images

• Azure Marketplace

• Custom apps and services

• Hyper-V

• DDA

• NVIDIA M60 GPU (Viz SKU)

• NVIDIA K80 GPU (Compute SKU)

Page 8:

DDA? (Discrete Device Assignment)

Page 9:
Page 10:

Real World Case Studies

Page 11:

“By using GPU resources in Azure, we can run simulations in days that would take a month on CPU-based machines. This speeds our progress toward the development of lifesaving drugs.”

Dr. Nagarajan Vaidehi

Director

Computational Therapeutics Core

Beckman Research Institute

“We are not short on ideas, just computers.”

City Of Hope

Page 12:

AudioBurst

Page 13:

Next-Gen Compute Virtual Machines (NC_v2)

Size    | NC6s_v2       | NC12s_v2      | NC24s_v2      | NC24rs_v2
Cores   | 6             | 12            | 24            | 24
GPU     | 1 x P100      | 2 x P100      | 4 x P100      | 4 x P100
Memory  | 112 GB        | 224 GB        | 448 GB        | 448 GB
Disk    | ~700 GB SSD   | ~1.4 TB SSD   | ~3 TB SSD     | ~3 TB SSD
Network | Azure Network | Azure Network | Azure Network | InfiniBand

Page 14:

HPC Workloads Performance Gains with P100

[Chart: speedup relative to a dual Broadwell CPU system (y-axis from 0x to 40x); series: 2x K80, 4x P100 16GB]

Broadwell CPU System: Dual E5-2690v3 @ 2.6GHz, 14 Core
GPU System: Same CPU system with 2x K80 and 4x P100 PCIe with 16GB

Page 15:

Artificial Intelligence

Page 16:
Page 17:

Seeing AI

Page 18:

Skype Translator

Page 19:

NOONUM

Page 20:

Algorithmia

Page 21:

Smart Refrigerator

Page 22:
Page 23:

The system's word error rate is reported to be 5.9 percent, which is "about equal" to professional transcriptionists asked to work on speech
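
Word error rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. The slide does not say how the 5.9 percent figure was measured, so the following is only a sketch of that standard definition; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Under this definition, a rate of 5.9 percent corresponds to roughly 6 word-level edits for every 100 reference words.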

Page 24:

Cognitive Toolkit fastest on Azure & Pascal GPUs

Page 25:

Deep Learning Virtual Machines (ND)

Size    | ND6s          | ND12s         | ND24s         | ND24rs
Cores   | 6             | 12            | 24            | 24
GPU     | 1 x P40       | 2 x P40       | 4 x P40       | 4 x P40
Memory  | 112 GB        | 224 GB        | 448 GB        | 448 GB
Disk    | ~700 GB SSD   | ~1.4 TB SSD   | ~3 TB SSD     | ~3 TB SSD
Network | Azure Network | Azure Network | Azure Network | InfiniBand

Page 26:

Training Workloads Performance Gains with P40

[Chart: training performance of 4x K80 vs. 4x P40 (y-axis from 0 to 5,000); models: AlexnetOWT, Googlenet, InceptionV3, ResNet-50, VGG16, AlexnetOWT, ResNet-152, ResNet-50; frameworks: CNTK, Caffe]

Speed-Up ranging to over 2x for training workloads

Page 27:

Up to 21x Inference Throughput with P40

[Chart: ResNet-50 inference throughput (images/second, y-axis from 0 to 4,000) vs. batch size (1, 2, 4, 8, 16, 32, 64, 128); series: K80, P40; annotation: 21x Speedup]

GPU: Ubuntu 14.04.5, TensorRT 2.1, CUDA 8.0.42, cuDNN 6.0.5; precision FP32 (K80), INT8 (P40 GPU).

Optimize performance with TensorRT and reduced precision
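
The K80 and P40 results on this slide use different precisions (FP32 and INT8), so part of the gain comes from storing and computing values in 8 bits. As a rough illustration of what reduced precision means, the sketch below applies symmetric linear quantization with a single shared scale. This is not TensorRT's API or its calibration procedure; `quantize_int8` and `dequantize` are hypothetical helpers written in plain Python for this example.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    if not values:
        return [], 1.0
    max_abs = max(abs(v) for v in values)
    # The largest magnitude maps to 127; an all-zero input keeps scale 1.0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the quantized integers."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.0, 1.0]
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    print(q)
    print(max(abs(w - r) for w, r in zip(weights, restored)))
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the accuracy given up in exchange for the smaller data type.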

Page 28:

NVIDIA Tesla P40 Demo

Page 29:

Follow me @Karan_Batta

Thanks!