Understanding Deep Networks through Properties of the Input Space · 2019-03-29


Page 1:

German Research Center for Artificial Intelligence (DFKI)

ALL RIGHTS RESERVED. No part of this work may be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system without expressed written permission from the authors.

Understanding Deep Networks through Properties of the Input Space

GTC 2019

By: Sebastian Palacio

Page 2:

Deep Neural Networks Work

DUH!

[Diagram: Neural Network]

Page 3:

...yet they can be easily tricked

[Diagram: Neural Network (two panels)]

Page 4:

Safeguarding becomes a “thing”

[Diagram: Neural Network with safeguards: Filter, Harden, Flag]

Page 5:

Cat and Mouse Chase

[Diagram: Neural Network with safeguards: Filter, Harden, Flag]

Page 6:

How do Attacks Work?

[Diagram: input → features → features → features → output]

Modify the Network

Page 7:

How do Attacks Work?

[Diagram: input → features → features → features → output]

Modify the Network

Modify the Input

Page 8:

How do Attacks Work?

[Diagram: input → features → features → features → output]

Modify the Input

Page 9:

How do Attacks Work?

[Diagram: input → features → features → features]

1. Pass input through the network: f(x)

Page 10:

How do Attacks Work?

[Diagram: input → features → features → features]

1. Pass input through the network: f(x)
2. Compute sensitivity: f’(x)

Page 11:

How do Attacks Work?

[Diagram: input → features → features → features → output]

Modify the Input

1. Pass input through the network: f(x)
2. Compute sensitivity: f’(x)
3. Modify input according to sensitivity.
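The three steps above can be sketched on a toy model. This is only an illustration, not the setup used in the talk: the logistic model, its weights `w`, `b`, the step size `eps`, and the sign-of-gradient update are all assumptions chosen to keep the example small.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def forward(w, b, x):
    """Step 1: pass the input through the (toy) network, f(x)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def input_gradient(w, b, x, y):
    """Step 2: sensitivity of the cross-entropy loss with respect to the input."""
    p = forward(w, b, x)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def perturb(x, grad, eps):
    """Step 3: move every input dimension by eps in the direction that raises the loss."""
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

# Hypothetical weights and input, chosen only for illustration.
w, b = [2.0, -1.0, 0.5], 0.0
x, y = [0.3, 0.1, 0.2], 1

p_clean = forward(w, b, x)                      # above 0.5: class 1 is predicted
x_adv = perturb(x, input_gradient(w, b, x, y), eps=0.25)
p_adv = forward(w, b, x_adv)                    # below 0.5: the prediction flips
```

In a deep network the gradient in step 2 would come from backpropagation instead of a closed-form expression, but the loop of forward pass, input gradient, and input update stays the same.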

Page 12:

Gradients are good estimators of the input space distribution

[Figure: input, gradient, perturbation]

Page 13:

How do Attacks Work?

1. Reconstruction:

Page 14:

How do Attacks Work?

2. Classification:

Page 15:

Idea against attacks!

Give me Gradients!

Page 16:

Reconstruction Gradients

Classification Gradients

AVOID THIS

Page 17:

Hypothesis: bigger problems are better

[Figure: Reconstruction Gradients and Classification Gradients, for MNIST and ImageNet]

Page 18:

...so we tried

SegNet + YFCC100M

ImageNet

69x

Page 19:

Perceptually similar!

Page 20:

How to Compare:

[Diagram: ResNet-50, SegNet]

Noise Level

Model Accuracy

Page 21:

Targeted Vs Untargeted Attacks:

[Figure: class scores 0.3, 0.5, 0.2 and their change 𝚫y]

Untargeted: Push the true class down until any other wins.

Targeted: Push a randomly selected target up until it wins.
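The two attack modes can be contrasted in a short sketch. It assumes a linear softmax classifier with a hypothetical weight matrix `W`, plain gradient steps on the log-probability, and a fixed `target` argument in place of the random draw mentioned on the slide; none of these details come from the talk.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(W, x):
    """Class probabilities of a linear softmax classifier (one row of W per class)."""
    return softmax([sum(wi * xi for wi, xi in zip(row, x)) for row in W])

def grad_log_prob(W, x, k):
    """Gradient of log p_k(x) with respect to x: W[k] - sum_j p_j * W[j]."""
    p = predict(W, x)
    return [W[k][d] - sum(p[j] * W[j][d] for j in range(len(W)))
            for d in range(len(x))]

def argmax(p):
    return max(range(len(p)), key=p.__getitem__)

def attack(W, x, true_class, target=None, step=0.05, max_iter=200):
    """Untargeted if target is None, otherwise targeted."""
    x = list(x)
    for _ in range(max_iter):
        pred = argmax(predict(W, x))
        if target is None and pred != true_class:
            break                      # any other class wins
        if target is not None and pred == target:
            break                      # the chosen target wins
        if target is None:
            g = grad_log_prob(W, x, true_class)
            x = [xi - step * gi for xi, gi in zip(x, g)]   # push the true class down
        else:
            g = grad_log_prob(W, x, target)
            x = [xi + step * gi for xi, gi in zip(x, g)]   # push the target up
    return x

# Hypothetical 3-class problem in 2D.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
x0 = [1.0, 0.5]                                   # classified as class 0
x_untargeted = attack(W, x0, true_class=0)
x_targeted = attack(W, x0, true_class=0, target=2)
```

The untargeted run stops as soon as any class other than the true one has the highest score; the targeted run keeps going until the chosen class wins.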

Page 22:

Quick, pick one at random!

Page 23:

HYPOTHESIS

Adversarial <-> Non-adversarial

[Figure: two rows of panels: Input, Input Gradients, Perturbation]

Page 24:

Adversaries fighting an attack-agnostic Autoencoder on ImageNet

[Plot legend]
● Baseline (no attack)
● Classifier only (no defense)
● Classifier with Autoencoder
● ALP for targeted PGD (Kannan et al. 2018)
● ALP for untargeted PGD (Engstrom et al. 2018)

[Plot annotations]
● Simple attack
● Loop with clipping
● Amount of noise
● Same but in a loop
● Fancy optimization
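As a rough illustration of what "same but in a loop" and "loop with clipping" could look like, the sketch below repeats a sign-of-gradient step and clips the result so that the total perturbation stays within a per-dimension budget. The toy logistic model, the names `eps`, `step`, and `n_steps`, and the exact clipping rule are assumptions for this example; the slide does not say which attack each annotation refers to.

```python
import math

def prob(w, x):
    """Toy logistic model: probability of class 1."""
    return 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))

def loss_grad(w, x, y):
    """Gradient of the cross-entropy loss with respect to the input."""
    p = prob(w, x)
    return [(p - y) * wi for wi in w]

def clip(v, lo, hi):
    return max(lo, min(hi, v))

def iterative_attack(w, x, y, eps, step, n_steps):
    """Repeat a small sign-gradient step, clipping to the eps-box around x."""
    x_adv = list(x)
    for _ in range(n_steps):
        g = loss_grad(w, x_adv, y)
        x_adv = [xa + step * ((gi > 0) - (gi < 0)) for xa, gi in zip(x_adv, g)]
        # Keep the accumulated perturbation within [-eps, eps] per dimension.
        x_adv = [clip(xa, xi - eps, xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv

# Hypothetical weights and input.
w = [2.0, -1.0, 0.5]
x = [0.3, 0.1, 0.2]
x_adv = iterative_attack(w, x, y=1, eps=0.1, step=0.04, n_steps=5)
```

Here `eps` plays the role of the "amount of noise": however many iterations run, no input dimension moves further than `eps` from its original value.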

Page 25:

Adversaries fighting an attack-agnostic Autoencoder on ImageNet

[Plot legend]
● Baseline (no attack)
● Classifier only (no defense)
● Classifier with Autoencoder
● ALP for targeted PGD (Kannan et al. 2018)
● ALP for untargeted PGD (Engstrom et al. 2018)

[Plot annotations]
● Simple attack
● Loop with clipping
● Amount of noise
● Same but in a loop
● Fancy optimization

Accuracy: 74.02 (No AE), 71.19 (With AE)

Page 26:

Structural Gradients Obstruct Gradient-Based Attacks*

[Figure: Reconstruction Gradients, Classification Gradients]

*as long as structure is not tightly related to semantics

Page 27:

A closer look at adversarial noise

MNIST

Expectation: Structural change

Reality: Non-structural changes

Uninformative dimensions!

Page 28:

Effects of extra dimensions

[Figure: 1D and 2D cases, with perturbation 𝚫x]

Page 29:

From 2D... ...to 3D

Semantic information on x,y

z-axis is uninformative

Page 30:

Decision Boundaries

Page 31:

Expected Boundary

● Z-axis does not interfere
● Perturbations need to go in the direction of the training samples

Page 32:

Vulnerable Boundary

● Small perturbations along the “extra” dimension change the predicted class!

Page 33:

Vulnerable Boundary

● Class boundary extends over the domain of other classes
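The difference between the expected and the vulnerable boundary can be reproduced with a toy example. The sketch assumes two hand-picked linear classifiers in 3D and training data that lies on the plane z = 0; the weights and points are hypothetical and only serve to show the effect.

```python
def classify(w, b, x):
    """Linear classifier: +1 on one side of the plane, -1 on the other."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if s > 0 else -1

# Training samples carry semantic information on x, y only; z is always 0.
train = [((1.0, 1.0, 0.0), 1), ((0.5, 1.5, 0.0), 1),
         ((-1.0, -1.0, 0.0), -1), ((-1.5, -0.5, 0.0), -1)]

# Both boundaries separate the training data perfectly, because the weight
# on z never matters while z = 0.
expected = ((1.0, 1.0, 0.0), 0.0)      # ignores the uninformative z-axis
vulnerable = ((1.0, 1.0, 40.0), 0.0)   # tilted steeply along z

# A small perturbation along the "extra" dimension.
x_clean = (1.0, 1.0, 0.0)
x_pert = (1.0, 1.0, -0.1)

pred_expected = classify(*expected, x_pert)      # still +1
pred_vulnerable = classify(*vulnerable, x_pert)  # flips to -1
```

Nothing in the training data constrains the weight on z, so a classifier is free to end up with either boundary; only the second one can be fooled by moving along the uninformative axis.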

Page 34:

Extrapolating...

1D, 2D, 3D, ..., 784D, ...

Page 35:

Preserve only the information that is useful for classification

Page 36:

Step 1: Train a classifier (ImageNet)

Step 2: Train an autoencoder (YFCC100M)

Step 3: Fine-tune the decoder with gradients from the classifier

Palacio, Sebastian et al. "What do Deep Networks Like to See?" CVPR (2018)
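Step 3 can be illustrated with a deliberately tiny stand-in. In the sketch below the classifier, encoder, and decoder are small linear maps with hypothetical values (`w`, `E`, `D`), the classifier and encoder are assumed to stay frozen, and only the decoder is updated with the gradient of the classification loss. It shows the direction of the gradient flow, not the actual training code behind the talk.

```python
import math

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def classifier_prob(w, x):
    """Frozen toy classifier: probability of class 1."""
    return 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))

def bce(p, y):
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def mean_loss(D, E, w, data):
    """Classification loss of the classifier applied to reconstructions."""
    return sum(bce(classifier_prob(w, matvec(D, matvec(E, x))), y)
               for x, y in data) / len(data)

def finetune_decoder(D, E, w, data, lr=0.1, epochs=50):
    """Update only the decoder D, using gradients that flow back from the classifier."""
    D = [row[:] for row in D]
    for _ in range(epochs):
        for x, y in data:
            z = matvec(E, x)                         # encoder output (frozen)
            x_hat = matvec(D, z)                     # reconstruction
            p = classifier_prob(w, x_hat)            # classifier output (frozen)
            g_xhat = [(p - y) * wi for wi in w]      # dLoss/dx_hat from the classifier
            for i in range(len(D)):
                for j in range(len(z)):
                    D[i][j] -= lr * g_xhat[i] * z[j]  # chain rule into the decoder
    return D

# Hypothetical stand-ins for the results of steps 1 and 2.
w = [1.5, -0.5]                        # "trained" classifier
E = [[1.0, 0.0], [0.0, 1.0]]           # "trained" encoder
D0 = [[0.5, 0.3], [0.2, 0.5]]          # "trained" decoder, before fine-tuning
data = [([1.0, 0.2], 1), ([-0.8, 0.1], 0), ([0.9, -0.3], 1), ([-1.0, 0.4], 0)]

loss_before = mean_loss(D0, E, w, data)
D1 = finetune_decoder(D0, E, w, data)
loss_after = mean_loss(D1, E, w, data)
```

After fine-tuning, the reconstructions are shaped by what the classifier finds useful, not by pixel-level fidelity.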

Page 37:

Accuracy on ResNet-50

● No AE: 74.02
● With AE: 71.19 (-2.83pp)
● With Fine-tuned AE: 74.94 (+0.92pp)

Palacio, Sebastian et al. "What do Deep Networks Like to See?" CVPR (2018)

Page 38:

Looking up Reconstructions

[Figure rows: Original; ResNet50 Reconstructions]

Page 39:

Experiments with S2SNets (on ImageNet)

[Plot legend]
● Baseline (no attack)
● Classifier only (no defense)
● Classifier with Autoencoder
● Classifier with S2SNet

● Consistent offset (projection of unnecessary input signal)
● Not bound to any specific adversarial attack.
● Zero compromise for clean images (no attack)

Accuracy: 74.94 (With S2SNet)

Page 40:

So, did we solve adversarial attacks?

● Function is a proof of concept for a defense principle:
  ○ Gradients are stable but convey information that is less effective for adversarial attacks.
  ○ No gradient obfuscation :)
● Content dependent.
● Still vulnerable under some specific but common threat conditions.

Page 41:

Summary

● Manifold exploration is possible through input gradients. They express different things depending on the task.
● If structural info != semantic info, autoencoders can help against adversarial attacks.
● Projection of redundant dimensions can be achieved via S2SNets.
● High dimensionality of the input space induces (exploitable) irregularities for decision boundaries.

It’s a sound design principle against gradient-based attacks

Enhancing robustness against adversarial attacks!

Page 42:

Thank you!

DeepLearning

Sebastian Palacio
[email protected]
@spalaciob

“Adversarial Defense using Structure-to-Signal Autoencoders”
https://arxiv.org/abs/1803.07994

In collaboration with:
● Joachim Folz (equal contribution)
● Jörn Hees
● Federico Raue

Supervisor:
● Andreas Dengel

DFKI Kaiserslautern

Some images have been taken from www.pexels.com and www.openclipart.org