Emotion detection for - NVIDIAon-demand.gputechconf.com/gtc/2017/presentation/s7691...@affectiva...

@affectiva Facial expression and Emotion detection for mobile Jay Turcot Director of Applied AI Affectiva


Page 1:

@affectiva

Facial expression and Emotion detection for mobile

Jay Turcot
Director of Applied AI
Affectiva

Page 2:

What if technology could identify emotions as humans can?

Page 3:

Our vision is to humanize technology with Emotion Intelligence by enabling machines to be emotion-aware and by allowing businesses to get emotion analytics.

Page 4:

Page 5:

How does it work?

Page 6:

How does it work?

• Face detection and tracking
• Facial action & attribute classification
• Facial expression interpretation

Page 7:

Task: Facial expression recognition

• Multi-attribute classification (~20+ classes)
• Upright, fixed-size, grayscale
• Fast enough to run on-device!

Examples: Brow furrow, Brow raise, Smile
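The task description above can be sketched in a few lines of NumPy: reduce a face crop to a fixed-size grayscale array, then score every attribute with its own sigmoid so that several can be active at once. This is only an illustration, not the model from the talk. The 100x100 size, the three attribute names, the nearest-neighbour resize and the single linear layer with random weights are all assumptions made for the example.

```python
import numpy as np

# Illustrative subset of the ~20+ attributes; names are assumptions.
ATTRIBUTES = ["brow_furrow", "brow_raise", "smile"]

def to_fixed_gray(rgb, size=100):
    """Convert an HxWx3 uint8 image to a size x size grayscale array scaled to [0, 1]."""
    luma = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = rgb.astype(np.float32) @ luma
    # Nearest-neighbour resize: pick one source row/column per output pixel.
    rows = np.arange(size) * gray.shape[0] // size
    cols = np.arange(size) * gray.shape[1] // size
    return gray[np.ix_(rows, cols)] / 255.0

def predict_attributes(face, weights, bias):
    """Multi-attribute output: one independent sigmoid per attribute (not a softmax)."""
    logits = face.reshape(-1) @ weights + bias
    return 1.0 / (1.0 + np.exp(-logits))

rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(240, 320, 3), dtype=np.uint8)  # stand-in for a face crop
face = to_fixed_gray(frame)
weights = rng.normal(scale=0.01, size=(face.size, len(ATTRIBUTES)))  # untrained placeholder
probs = predict_attributes(face, weights, np.zeros(len(ATTRIBUTES)))
```

Because each attribute has its own sigmoid, a face can score high on both "smile" and "brow_raise" at the same time, which a single softmax over classes would not allow.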

Page 8:

Emotion AI platform built on deep learning

Input: Labeled and unlabeled video (+voice) data. Metadata. Latest training used 1M+ images.

Convolutional Neural Networks

Output:
• Emotions: Sadness, Joy, Anger, Surprise, Contempt, Disgust, Fear
• 11 facial expressions
• Age, Ethnicity, Gender

Page 9:

Training Setup

FAST
• NVIDIA Titan X (Pascal): 12 GB, 3584 CUDA cores
• NVIDIA CUDA® Deep Neural Network library (cuDNN): CUDA 8.0, cuDNN 5

SIMPLE
• Keras + TensorFlow, Docker: TensorFlow 1.0, nvidia-docker

Page 10:

A few tips on training

Page 11:

Use all annotated data available!

Use every frame in a video as well as partially annotated data

[Figure: three bar charts, Training Loss (axis 0 to 0.04), Testing Loss (axis 0 to 0.6) and Accuracy (overall) (axis 0.91 to 0.926), each comparing three training sets: sampled frames, all frames, and all frames plus data with partial annotation.]
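One common way to train on partially annotated data is to mask the loss so that attributes without a label contribute nothing. The sketch below shows that idea with NumPy. It is an assumption about how such data could be used, not a description of the training code behind the slide, and the function name and mask convention (1 = annotated, 0 = missing) are made up for the example.

```python
import numpy as np

def masked_bce(probs, labels, mask, eps=1e-7):
    """Mean binary cross-entropy over annotated entries only (mask == 1)."""
    probs = np.clip(probs, eps, 1.0 - eps)
    per_label = -(labels * np.log(probs) + (1.0 - labels) * np.log(1.0 - probs))
    # Unannotated entries are multiplied by 0, so their placeholder labels are ignored.
    return float((per_label * mask).sum() / max(mask.sum(), 1.0))

# Two frames, three attributes. The second frame is only annotated for attribute 0.
probs = np.array([[0.9, 0.2, 0.5], [0.1, 0.7, 0.5]])
labels = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
mask = np.array([[1.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
loss = masked_bce(probs, labels, mask)
```

With this convention a frame that is labeled for only one attribute still contributes to training for that attribute, instead of being discarded.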

Page 12:

Balancing data isn’t strictly required
Classes with ~3 times more data

[Figure: bar chart (vertical axis 0.87 to 0.94) for Gender, Smile, Brow raise and Brow furrow, comparing balanced sampling with unique sampling.]

Balanced: 90.5%
Natural (unbalanced): 90.8%
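The two sampling strategies being compared can be illustrated with the standard library alone. The sketch below is an assumption about what "balanced" and "natural" mean here: natural sampling draws examples uniformly, so classes appear at their raw frequency, while balanced sampling weights each example by the inverse of its class count. The 3:1 label split mirrors the "~3 times more data" note, and the label names are illustrative.

```python
import random
from collections import Counter

def natural_sample(labels, n, rng):
    """Draw examples uniformly, so classes appear at their natural frequency."""
    return [rng.choice(labels) for _ in range(n)]

def balanced_sample(labels, n, rng):
    """Weight each example by 1 / class count, so every class is drawn about equally often."""
    counts = Counter(labels)
    weights = [1.0 / counts[y] for y in labels]
    return rng.choices(labels, weights=weights, k=n)

rng = random.Random(0)
labels = ["smile"] * 750 + ["no_smile"] * 250  # 3:1 imbalance, illustrative only
natural = Counter(natural_sample(labels, 10000, rng))
balanced = Counter(balanced_sample(labels, 10000, rng))
```

In a real training loop the sampled items would be example indices rather than bare labels, but the class proportions behave the same way.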

Page 13:

Building fast models

Page 14:

Speeding up deep learning models
Several approaches are used for speeding up models

• Model Pruning
• Model Compression
• Model Quantization
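As a concrete instance of the first approach, the sketch below shows magnitude pruning, one common form of model pruning: the weights with the smallest absolute values are set to zero. The slide does not say which pruning scheme is meant, so the choice of magnitude pruning, the function name and the 50% sparsity are assumptions for illustration.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction `sparsity` set to zero."""
    k = int(round(sparsity * weights.size))
    if k == 0:
        return weights.copy()
    # The k-th smallest absolute value becomes the cut-off.
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
pruned = magnitude_prune(w, 0.5)
kept = np.count_nonzero(pruned)
```

Zeroed weights only translate into faster inference when the runtime can skip them, for example through sparse kernels or by removing whole filters.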

Page 15:

Match architecture to the problem
Avoid network architectures that are larger than needed

Problem: Object detection (& classification)
• Details: 1000 classes; ~224x224 pixels, color; objects with arbitrary scales / positions / orientations
• Architectures: VGG-16 [1], 16 layers (~30.9 GOP/image); ResNet [2], 152 layers (~22.6 GOP/image); others: Inception v4, E-Net

Problem: Facial action & attribute classification
• Details: 20+ classes; ~100x100 pixels, grayscale; faces only, upright & registered
• Architectures: ?
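A rough operation count shows why the input format alone changes the budget. The sketch below counts multiply-accumulates for a single stride-1, same-padded 3x3 convolution applied to each input type. The 64 output channels are an illustrative assumption, and the helper `conv_macs` is not taken from any library.

```python
def conv_macs(height, width, in_ch, out_ch, kernel):
    """Multiply-accumulates for a stride-1, same-padded kernel x kernel convolution."""
    return height * width * in_ch * out_ch * kernel * kernel

big = conv_macs(224, 224, 3, 64, 3)    # 224x224 color input
small = conv_macs(100, 100, 1, 64, 3)  # 100x100 grayscale input
ratio = big / small                    # about 15x fewer operations for the small input
```

The same first layer costs roughly 15 times less on the face input, before any change to depth or width is considered.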

Page 16:

Lots of big filters are expensive!
Use smaller filters to condense information
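One reading of this advice is the familiar bottleneck pattern: a cheap 1x1 convolution first reduces the channel count, and the larger filter then runs on the condensed feature map. The sketch below compares multiply-accumulate counts for the two layouts. The 50x50 feature map and the 64 and 16 channel counts are illustrative assumptions, and the slide does not confirm that this exact pattern was used.

```python
def conv_macs(height, width, in_ch, out_ch, kernel):
    """Multiply-accumulates for a stride-1, same-padded kernel x kernel convolution."""
    return height * width * in_ch * out_ch * kernel * kernel

# Direct: a 3x3 convolution over all 64 input channels.
direct = conv_macs(50, 50, 64, 64, 3)

# Condensed: a 1x1 convolution squeezes 64 channels to 16, then the 3x3 runs on 16.
squeezed = conv_macs(50, 50, 64, 16, 1) + conv_macs(50, 50, 16, 64, 3)

saving = direct / squeezed  # 3.6x fewer operations in this configuration
```

The saving comes entirely from the 3x3 filters seeing fewer input channels; whether accuracy holds up has to be checked empirically.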

Page 17:

Look for redundancy in your layers
Small filters are faster… but can be highly correlated
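A simple way to look for this kind of redundancy is to flatten each filter in a layer and compute pairwise correlations; pairs close to 1 are candidates for merging or removal. The sketch below does this on synthetic filters, with one near-duplicate planted on purpose. The 0.95 threshold and the function name are assumptions for the example, not values from the talk.

```python
import numpy as np

def filter_correlations(filters):
    """filters: (n_filters, k, k). Returns the n x n matrix of absolute Pearson correlations."""
    flat = filters.reshape(filters.shape[0], -1)
    return np.abs(np.corrcoef(flat))

rng = np.random.default_rng(0)
filters = rng.normal(size=(4, 3, 3))
# Plant a redundant filter: filter 3 is a scaled, slightly noisy copy of filter 0.
filters[3] = 2.0 * filters[0] + 0.01 * rng.normal(size=(3, 3))

corr = filter_correlations(filters)
np.fill_diagonal(corr, 0.0)                          # ignore self-correlation
redundant_pairs = np.argwhere(np.triu(corr) > 0.95)  # upper triangle: each pair once
```

Correlation ignores scale, so a filter that is just a rescaled copy of another is flagged even though its weights differ.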

Page 18:

Small networks still work very well…
… for simpler problems

[Figure: two bar charts comparing CNN (small), CNN (medium) and ENet. Values are listed in the order they appear in the slide.]
MFLOPs: 7.14, 21.97, 15.96
Accuracy: 92.99%, 93.09%, 92.97%

Page 19:

Result

• Smaller models (<10 MFLOPs) still outperform traditional methods
  • Don’t just copy architectures like VGG (30+ GOP/image)
  • Explore network architectures that prioritize efficiency (E-Net)
• Other methods still apply to improve runtime performance:
  • Quantize models to 8-bit fixed point or binary
  • Prune models
• Models deployed in our on-device SDK: http://developer.affectiva.com/
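The 8-bit quantization step mentioned in the list above can be sketched as symmetric linear quantization: weights are divided by a per-tensor scale, rounded, and stored as int8. This is one simple scheme chosen for illustration. The slide does not specify the fixed-point format, so the per-tensor scale and the function names are assumptions.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric linear quantization: float weights -> (int8 values, float scale)."""
    max_abs = float(np.abs(weights).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to approximate float weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(32, 32)).astype(np.float32)
q, scale = quantize_int8(w)
max_error = float(np.abs(dequantize(q, scale) - w).max())  # at most about scale / 2
```

Storage drops from 32 to 8 bits per weight, and the rounding error per weight is bounded by about half the scale.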

Page 20:

Questions
