GTC China 2016

GTC 2016 — China

THE DEEP LEARNING AI REVOLUTION

2

GPU DEEP LEARNING BIG BANG

Deep Learning NVIDIA GPU

NIPS (2012)

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, University of Toronto

Ilya Sutskever, University of Toronto

Geoffrey E. Hinton, University of Toronto

3

GPU DEEP LEARNING ACHIEVES “SUPERHUMAN” RESULTS

2012: Deep Learning researchers worldwide discover GPUs

2015: DNN achieves superhuman image recognition

2015: Deep Speech 2 achieves superhuman voice recognition

Chart: ImageNet accuracy (%), 2010 to 2015. Labels: Hand-coded CV, DL, Human. Values shown: 74% and 96%. Annotation: Microsoft, Google, 3.5% error rate.

4

NVIDIA — “THE AI COMPUTING COMPANY”

GPU Computing | Computer Graphics | Artificial Intelligence

5

ANNOUNCING NEW GRAPHICS SDKS

Funhouse VR: Open Source

360 Video 1.0: Real-Time Panoramic VR

Iray VR: Photorealistic VR Ray Tracing

GVDB: Sparse Volumes for Special Effects

Remote Rendering: Video Compositing

Ansel: In-game Photography

Volumetric: Physical Light Models

OptiX 4.0: Multi-GPU Ray-Tracing

MDL 1.0: Physically Based Materials

Mental Ray: Now GPU-Accelerated!

7

NVIDIA VR FUNHOUSE

8

NVIDIA SILICON VALLEY HEADQUARTERS

9

GTC — 25X GROWTH IN GPU DL DEVELOPERS

4X Attendees: 3,700 (2014), 16,000 (2016)

3X GPU Developers: 120,000 (2014), 400,000 (2016)

25X Deep Learning Developers: 2,200 (2014), 55,000 (2016)

2014: • Japan • United States

2016: • Australia • China • Europe • India • Japan • Korea • United States (Silicon Valley, D.C.)

• Higher Ed 35% • Software 19% • Internet 15% • Auto 10% • Government 5% • Medical 4% • Finance 4% • Manufacturing 4%
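The headline multiples on the GTC growth slide can be checked against the counts it lists. The following is a minimal sketch in Python, assuming each 2014 count pairs with the 2016 count that reproduces the stated multiple; the pairing is inferred from the numbers and is not labeled in the slide text, and the names `GTC_COUNTS` and `growth_multiple` are illustrative only.

```python
# Counts as printed on the slide. Pairing each 2014 count with a 2016 count
# is an assumption: the slide text does not label the pairs explicitly.
GTC_COUNTS = {
    "Attendees": (3_700, 16_000),
    "GPU Developers": (120_000, 400_000),
    "Deep Learning Developers": (2_200, 55_000),
}


def growth_multiple(count_2014: int, count_2016: int) -> float:
    """Return how many times larger the 2016 count is than the 2014 count."""
    return count_2016 / count_2014


# Growth multiple for each category, keyed by category name.
MULTIPLES = {name: growth_multiple(*pair) for name, pair in GTC_COUNTS.items()}

for name, multiple in MULTIPLES.items():
    print(f"{name}: {multiple:.1f}X")
```

Under that pairing the multiples come out at roughly 4.3X, 3.3X, and 25X, which the slide rounds to 4X, 3X, and 25X.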

10

WHY DID AI RESEARCHERS ADOPT GPUs FOR DEEP LEARNING?

11

BRAIN IS LIKE A GPU

BRAIN CREATES MENTAL IMAGES WHEN WE THINK

12

GPU IS LIKE A BRAIN

13

GPU DEEP LEARNING IS A NEW COMPUTING MODEL

Training

Device

Datacenter

14


Training

Device

Datacenter

TRAINING

Billions of Trillions of Operations

GPUs train larger models, accelerate time to market

15


Training

Device

Datacenter

DATACENTER INFERENCING

10s of billions of image, voice, video queries per day

GPU inference for fast response, maximize datacenter throughput

16


Training

Device

Datacenter

DEVICE INFERENCING

Billions of intelligent devices

GPU for real-time accurate response

17

AI — THE ULTIMATE COMPUTING CHALLENGE

Important Property of Neural Networks

Results get better with more data + bigger models + more computation

(Better algorithms, new insights and improved techniques always help, too!)

IMAGE RECOGNITION

2012 AlexNet: 8 layers | 1.4 GFLOP/image | ~16% Error

2015 ResNet: 152 layers | 22.6 GFLOP/image | ~3.5% Error

16X Model

SPEECH RECOGNITION

2014 Deep Speech 1: 2 ExaFLOPS | 25M | 7,000 Hours | ~8% Error

2015 Deep Speech 2: 20 ExaFLOPS | 100M | 12,000 Hours | ~5% Error

10X Training Ops
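The "16X Model" and "10X Training Ops" labels on this slide appear to be simple ratios of the figures shown. A short sketch in Python, assuming "16X Model" refers to compute per image (GFLOP/image) and "10X Training Ops" to the ExaFLOPS figures; the slide does not state which quantities the multiples are taken over, and the variable names are illustrative.

```python
# Values as printed on the slide.
ALEXNET_GFLOP_PER_IMAGE = 1.4   # 2012, 8 layers
RESNET_GFLOP_PER_IMAGE = 22.6   # 2015, 152 layers
DEEP_SPEECH_1_EXAFLOPS = 2      # 2014
DEEP_SPEECH_2_EXAFLOPS = 20     # 2015

# Assumption: "16X Model" compares compute per image between the two networks.
model_growth = RESNET_GFLOP_PER_IMAGE / ALEXNET_GFLOP_PER_IMAGE

# Assumption: "10X Training Ops" compares the two ExaFLOPS figures.
training_ops_growth = DEEP_SPEECH_2_EXAFLOPS / DEEP_SPEECH_1_EXAFLOPS

print(f"ResNet vs AlexNet compute per image: {model_growth:.1f}X")
print(f"Deep Speech 2 vs Deep Speech 1 training ops: {training_ops_growth:.0f}X")
```

The first ratio is about 16.1, the second exactly 10, matching the slide's rounded labels.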

18

PASCAL “5 MIRACLES” BOOST DEEP LEARNING 65X

Pascal — 5 Miracles NVIDIA DGX-1 Supercomputer 65X in 4 yrs Accelerate Every Framework

PaddlePaddle: Baidu Deep Learning

Pascal

16nm FinFET

CoWoS HBM2

NVLink

cuDNN

Chart: Relative speed-up of images/sec vs K40 in 2013. AlexNet training throughput based on 20 iterations. CPU: 1x E5-2680v3 12 Core 2.5GHz, 128GB System Memory, Ubuntu 14.04. M40 datapoint: 8x M40 GPUs in a node. P100: 8x P100, NVLink-enabled.

Chart axes: speed-up up to 70X in 10X steps, by year from 2013 to 2016. GPU generations: Kepler, Maxwell, Pascal.

19

ANNOUNCING NEW IBM SERVER: POWER8 + NVIDIA TESLA P100 FOR THE AI ENTERPRISE

“Putting NVIDIA’s technology into the IBM system will speed

up performance for such emerging workloads as AI, deep

learning and data analytics.” — eWeek

20

Andrew Ng, Chief Scientist

21

Training

Device

Datacenter

22

ANNOUNCING TESLA P4 & P40 INFERENCING ACCELERATORS

Pascal Architecture | INT8

P4: 50W | 40X Energy Efficient versus CPU

P40: 250W | 40X Performance versus CPU

23

ANNOUNCING TensorRT: PERFORMANCE OPTIMIZING INFERENCING ENGINE

FP32, FP16, INT8 | Vertical & Horizontal Fusion | Auto-Tuning

VGG, GoogLeNet, ResNet, AlexNet & Custom Layers

Available Today: developer.nvidia.com/tensorrt

26

NVIDIA GPU DEEP LEARNING EVERYWHERE

Alibaba/Aliyun

iQIYI

Shazam

Amazon

JD.com

Skype

Facebook

Orange

Twitter

Flickr

Periscope

Yahoo Supermarket

Google

Pinterest

Yandex

iFLYTEK

Qihoo 360

Yelp

eBay

Tencent

Netflix

Baidu

Sogou

Microsoft

27

>1,500 AI STARTUPS AROUND THE WORLD

Deep Learning for Cybersecurity

Deep Learning for Genomics

Deep Learning for Self-Driving Cars

Deep Learning for Art

28

AI STARTUPS IN CHINA

Weather & Environment Forecast

Eye-tracking for Human-machine Interaction

Medical Imaging

Face Recognition

Product Recognition, Detection, Search

Personal Concierge App

29

Training

Device

Datacenter

30

“BILLIONS OF INTELLIGENT DEVICES”

“Billions of intelligent devices will take advantage of DNNs to provide personalization and localization as GPUs become faster and faster over the next several years.”

— Tractica

31

AI CITY — 1B CAMERAS BY 2020

~1 billion cameras worldwide by 2020

30 billion inferences/sec

Tesla P40: 2,500 inferences/sec @ 720P

AI City needs ~10M P40 servers

DATA: 1B cameras, IHS “Video Surveillance Intelligence Service, Aug. 2016”
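The AI City estimate is a back-of-the-envelope division of the slide's own figures. A minimal sketch in Python using only the numbers printed above; the variable names are illustrative, and treating the result as a count of P40 accelerators (rather than servers holding several of them) is an assumption, since the slide does not say how many P40s a server contains.

```python
# Figures as printed on the slide.
CAMERAS = 1_000_000_000                    # ~1 billion cameras by 2020
TOTAL_INFERENCES_PER_SEC = 30_000_000_000  # 30 billion inferences/sec
P40_INFERENCES_PER_SEC = 2_500             # Tesla P40 at 720P

# Implied load per camera: total inference rate divided by camera count.
inferences_per_camera = TOTAL_INFERENCES_PER_SEC // CAMERAS

# Assumption: one P40 sustains exactly the quoted rate, with no overhead.
p40s_needed = TOTAL_INFERENCES_PER_SEC // P40_INFERENCES_PER_SEC

print(f"Inferences per camera per second: {inferences_per_camera}")
print(f"P40s needed: {p40s_needed:,}")
```

This gives 30 inferences per camera per second and 12,000,000 P40s, the same order of magnitude as the slide's "~10M" figure.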

32

1/20TH THE SPACE, 1/10TH THE POWER

Hikvision Blade: 16 Jetson TX1s

NVIDIA DGX-1 Traditional Server Hikvision Blade

~21 1U Servers | 42 CPUs | ~4,000 W

1 Hikvision Blade | 16 TX1 + 1 CPU | >8 1080 streams | ~300 W
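The space and power claims in this slide's title can be compared with the figures listed for the two configurations. A rough sketch in Python, assuming one Hikvision blade occupies about the same rack space as one 1U server; the slide does not state the blade's size, and the variable names are illustrative.

```python
# Figures as printed on the slide.
TRADITIONAL_SERVERS = 21    # ~21 1U servers
TRADITIONAL_WATTS = 4_000   # ~4,000 W
BLADE_UNITS = 1             # 1 Hikvision blade (assumed comparable to one 1U server)
BLADE_WATTS = 300           # ~300 W

# How many times more power and rack units the traditional setup uses.
power_ratio = TRADITIONAL_WATTS / BLADE_WATTS
unit_ratio = TRADITIONAL_SERVERS / BLADE_UNITS

print(f"Power: traditional setup draws {power_ratio:.1f}X the blade")
print(f"Space: traditional setup uses {unit_ratio:.0f}X the units")
```

The ratios are about 13.3X for power and 21X for unit count, in line with the title's rounded "1/10th the power" and "1/20th the space".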

33

ANNOUNCING NVIDIA AI CITY PARTNERS

34

AI TRANSPORTATION — $10T INDUSTRY

PERCEPTION AI PERCEPTION AI LOCALIZATION DRIVING AI

DEEP LEARNING

35

FREE SPACE DETECTION | CAR 3D DETECTION

36

NVIDIA BB8 AI CAR

37

NVIDIA DRIVE PX 2

AutoCruise to Full Autonomy: One Architecture

Full Autonomy

AutoChauffeur

AutoCruise

AUTONOMOUS DRIVING: Perception, Reasoning, Driving

AI Supercomputing, AI Algorithms, Software

Scalable Architecture

38

NVIDIA DRIVE PX 2 AUTOCRUISE

10W AI Car Computer | Passive Cooling | Automotive IO

AI Highway Driving | Localization & Mapping

39

NVIDIA & BAIDU PARTNER ON AI SELF-DRIVING CARS

40

NVIDIA AI SELF-DRIVING CARS IN DEVELOPMENT

Baidu | nuTonomy | Volvo | WEpods | NVIDIA

41

NVIDIA END-TO-END DEEP LEARNING PLATFORM

TRAINING


DGX-1 | TESLA P100

42


TRAINING


DGX-1 | TESLA P100


ANNOUNCING TESLA P4 & P40

ANNOUNCING TensorRT

43


TRAINING


DGX-1 | TESLA P100


ANNOUNCING TESLA P4 & P40

ANNOUNCING TensorRT

CUDA

JETPACK DRIVEWORKS

JETSON TX1

ANNOUNCING DRIVE PX 2 AUTOCRUISE

INTELLIGENT DEVICES

44

NVIDIA DEEP LEARNING PLATFORM PARTNERS

AI ENTERPRISE | AI CITY | AI CAR

45

AI FOR EVERYONE

AI will Revolutionize Transportation | AI will Revolutionize Healthcare | AI will Revolutionize Society
