Photonics for AI

25
Photonics for AI Dr. Yichen Shen Presentation @ MIT Apr 26 th , 201 8 1

Transcript of Photonics for AI

Photonics for AI

Dr. Yichen Shen

Presentation @ MIT

Apr 26th, 2018

1

NanophotonicsOptical structures (dielectrics) with nanometer-scale features

Photonic crystals2-D

periodic intwo directions

3-D

periodic inthree directions

1-D

periodic inone direction

Metamaterials

Thanks to improved computation power & fabrication techniques

Nanophotonic structures

(Feature size <≈ Wavelength)

wavelength of visible light

m mm mm nmFeature size

Ray optics

Macroscopic structures

(Feature size >> Wavelength)

Limited ways to

manipulate light

2

500 nmIntegrated Photonics

Artificial Neural Networks (ANN)

4/26/20183Deep Learning with Coherent

Nanophotonic Circuits

Breakthroughs in deep

learning: • Natural Language Processing

(NLP)

• Game Playing (Go, Atari)

• Autonomous Vehicles

• Control

• Ad Placement

• Researches (drug discovery,

material study)

• Etc.

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits4

Neuromorphic Computing

Biological Neural Networks Artificial Neural Networks

Basic Algorithm of ANN

Deep Learning with Coherent Nanophotonic

Circuits5 4/26/2018

Matrix Multiplication:

Nonlinear Activation:

… …

1

2

9

Hardware and Data Enable Deep Learning

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits6

The Need for Speed

More Data Bigger Models More need for Computation

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits7

But Moore’s Law is no-longer providing more computation…

The Market:

On clouds:

Millions of high power AI

processors ($10,000 each)

in data centers by 2020

On premise:

Billions of compact AI

processors needed due

to the rise of

autonomouse driving,

AR and IoT.Von Neumann ASIC/FPG

A

Optical

Processing

In Deep Learning

Key Operation is dense M x V

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits8

aj

bi Wij= x

In Optics, Matrix Multiplication

is very common & (usually)

consumes no energy !

Convolution / FFT

Matrix

Multiplication

ANN does NOT require high resolution

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits9

Sze et al, arXiv:1703.09039 (2017)

Deep Learning Inference is

“Passive”

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits10

Once the Optical Neural Network is trained, no need to

update the weights frequently…

Deep Learning is very

parallelizable

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits11

Multiple wavelengths can be used to simultaneously

execute batch of data

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits12

Coherent Optical Neural Networks (ONN)

X1

X2

X3

X4

h1(1)

h2(1)

h3(1)

h4(1)

Z (1) = W0X

h1(i)

h2(i)

h3(i)

h4(i)

h1(n)

h2(n)

h3(n)

h4(n)

Y1

Y2

Y3

Y4

h( i ) = f (Z ( i ) ) Y = Wn h(n )

Input LayerHidden Layers

Output Layer

Layer i Layer n

Optical Input Optical Output

Layer 1X Y

a

b

c

x i n xou t

Waveguide

0

0

Optical Interference Unit Optical Nonlinearity Unit

0

0

d

S(n)V (n) U (n)

Vowel X

M (n)M (1) = U (1)S(1)V (1) M (2) M (n− 1)

fNL()fNL()

Photonic Integrated Circuit

fNL

Programmable Nanophotonic Processors

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits13

100µm

S U ( 4 ) C o r e D M M C

a

b

c

Phase Shifter

Waveguide

Transm

issio

n

Voltage2

O p t i c a l I n t e r f e r e n c e U n i t ( O I U )

MZI

Detectors

Input Modes

φi✓i

Y.Shen and N. Harris et al. “Deep Learning with Coherent Nanophotonic Circuit” Nature Photonics 11, 441–446 (2017)

4/26/2018Deep Learning with Coherent Nanophotonic Circuits 14

da

bLaser OIU Detectors Computer

U1S1V1

Tra

nsm

issio

nOIU1

OIU2 CPU OIU

3OIU

4

U1S1V1fSA ( Iin)

Input Output

Instance

Insta

nce

60µm

Optical Interference Unit

SU(4) Core DMMC

Dq Df

c

V2Dq

18µm

Directional Coupler

Phase ShifterSimulated Optical

Nonlinearity

Y.Shen and N. Harris et al. “Deep Learning with Coherent Nanophotonic Circuit” Nature Photonics 11, 441–446 (2017)

Optical Vowel Recognition

(4d 4 classes)

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits15

Experimental Result

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits16

Simulation Result: 165/180=91.7%

Experiment Result: 138/180=76.7%

Vowel IdentifiedVowel Identified

Vo

wel

Sp

oken

Vo

wel

Sp

oken

The other side of the Story…

Immature photonics eco-system (low yield, high cost)

Large device size

Non-ideal PDK component design (lossy, low resolution,

power hungry)

AD/DA interface

17

Software & HardwareAI algorithms DESIGNED to be run on photonics chip

18

L. Jing & Y. Shen et al, International Conference for Machine Learning (ICML

2017)

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits19

Fully Connected Neural Networks

Recurrent Neural Networks Convolutional Neural Networks

Recurrent Neural Networks

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits20

Commonly used for Speech Recognition and Language Processing

Convolution Neural Networks

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits21

Scott Skirlo and Yichen Shen et al, Manuscript in Preparation

Speed and Energy Efficiency Comparison with Electrical ANN

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits22

NVIDIA TITAN X ONN

(with thermal PS)

Architecture Von Neumann Neuromorphic

Power Consumption 1 kW 1-2 kW

Operation Speed 10 TFLOP 10,000 TFLOP

Y. Shen and N. Harris et al, Nature Photonics 11, 441 (2017)

4/26/2018Deep Learning with Coherent Nanophotonic Circuits 23

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits24

Optical AI Computing

Some History on Optical Neural

Networks

4/26/2018Deep Learning with Coherent

Nanophotonic Circuits25