
Explaining Deep Neural Networks with a Polynomial Time Algorithm for Shapley Values Approximation

Marco Ancona (1), Cengiz Öztireli (2), Markus Gross (1, 2)

(1) Department of Computer Science, ETH Zurich, Switzerland
(2) Disney Research, Zurich, Switzerland

Attribution methods

[Figure: an attribution method takes a pre-trained model and a target output and assigns a relevance score to each input feature.]

Layer-wise Relevance Propagation (LRP), Bach et al. 2015
DeepLIFT, Shrikumar et al. 2017
Saliency Maps, Simonyan et al. 2015
Integrated Gradients, Sundararajan et al. 2017
Grad-CAM, Selvaraju et al. 2016
Simple occlusion, Zeiler et al. 2014
LIME, Ribeiro et al. 2016
Guided Backpropagation, Springenberg et al. 2014
Prediction Difference Analysis, Zintgraf et al. 2017
Meaningful Perturbation, Fong et al. 2017
Gradient * Input, Shrikumar et al. 2016
KernelSHAP / DeepSHAP, Lundberg et al. 2017


Evaluating attribution methods

• No ground-truth explanation → not easy to evaluate empirically
• Often based on heuristics → not easy to justify theoretically

“Axiomatic approach”: from a set of desired properties to the method definition

(Some) desirable properties

Completeness: Attributions should sum up to the output of the function being considered, for comprehensive accounting.

Continuity: Attributions for two nearly identical inputs on a continuous function should be nearly identical.

Linearity: Attributions generated for a linear combination of two models should also be a linear combination of the original attributions.

Symmetry: If two features have exactly the same role in the model, they should receive the same attribution.
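These properties can be checked by hand on a tiny model. The sketch below is a hypothetical two-feature example (not from the slides) that verifies Symmetry and Completeness using the direct 2-player Shapley computation:

```python
# Toy model: f(x1, x2) = x1 + x2 + x1*x2, with "off" features set to the
# baseline 0. The two features play exactly symmetric roles.
def f(on):
    x1 = 1.0 if 1 in on else 0.0
    x2 = 1.0 if 2 in on else 0.0
    return x1 + x2 + x1 * x2

# With two players, the Shapley value of each feature is the average of
# its marginal contribution when added first and when added second.
phi1 = 0.5 * (f({1}) - f(set())) + 0.5 * (f({1, 2}) - f({2}))
phi2 = 0.5 * (f({2}) - f(set())) + 0.5 * (f({1, 2}) - f({1}))

assert phi1 == phi2                          # Symmetry
assert phi1 + phi2 == f({1, 2}) - f(set())   # Completeness
```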

Shapley Values (Shapley, Lloyd S., 1953)

The only attribution method that satisfies all the aforementioned properties.

Shapley Values (Shapley, Lloyd S., 1953)

Let f be the function to analyze (e.g. the map from the input layer to a specific output neuron in a DNN), let P be the set of all input features, and let S range over all unique subsets of features taken from P. The Shapley value of feature i is

φ_i = Σ_{S ⊆ P∖{i}}  [ |S|! (|P| − |S| − 1)! / |P|! ] · ( f(S ∪ {i}) − f(S) )

“The average marginal contribution of a feature with respect to all subsets of other features.”

Issue: testing all subsets is infeasible!
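As a brute-force illustration of the definition (illustrative code, not the authors' implementation), exact Shapley values can be computed by enumerating all subsets, which makes the O(2^|P|) cost of the exact computation concrete:

```python
from itertools import combinations
from math import factorial

def shapley_values(f, n):
    """Exact Shapley values for a set function f over features {0..n-1}.
    f maps a frozenset of 'on' features to a number; cost is O(2^n)."""
    players = list(range(n))
    phi = [0.0] * n
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Shapley weight |S|! (n - |S| - 1)! / n! for subsets of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for S in combinations(others, size):
                S = frozenset(S)
                phi[i] += weight * (f(S | {i}) - f(S))
    return phi
```

For a linear function the Shapley value of each feature equals its own coefficient, which gives a quick sanity check of the implementation.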

Shapley value sampling (Castro et al., 2009)

[Figure: estimated Shapley values accumulating per feature as samples are drawn, e.g. 0.16, 0.10, 0.25, −0.35]

Pros: Shapley value sampling is unbiased.

Cons: it might require a lot of samples (network evaluations) to produce an accurate result.
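A minimal sketch of the permutation-based estimator of Castro et al. (names and structure here are illustrative): each random permutation of the features contributes one marginal-contribution term per feature, and averaging over permutations yields an unbiased estimate of the Shapley values:

```python
import random

def shapley_sampling(f, n, num_samples, seed=0):
    """Monte Carlo Shapley estimate (Castro et al., 2009): average each
    feature's marginal contribution over random feature orderings."""
    rng = random.Random(seed)
    phi = [0.0] * n
    for _ in range(num_samples):
        perm = list(range(n))
        rng.shuffle(perm)
        on = set()
        prev = f(on)
        for i in perm:            # add features one by one in random order
            on.add(i)
            cur = f(on)
            phi[i] += cur - prev  # marginal contribution of feature i
            prev = cur
    return [p / num_samples for p in phi]
```

Every sample costs n + 1 evaluations of f, i.e. n + 1 forward passes when f is a network, which is where the "slow" in the cons above comes from.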

Can we avoid sampling?


Deep Approximate Shapley Propagation

[Figure: with k out of N input features "on", activations are modeled as distributions and propagated through the network; after a ReLU, the propagated Normal distribution becomes a "rectified" Normal distribution.]

To propagate distributions through the network layers we use Lightweight Probabilistic Deep Networks (Gast et al., 2018), which provide probabilistic versions of common layers:

• Affine transformation
• Rectified Linear Unit
• Leaky Rectified Linear Unit
• Mean pooling
• Max pooling

The use of other probabilistic frameworks is also possible.
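As a simplified sketch of how such probabilistic layers work (assuming independent, Normally distributed activations; this is an illustration, not the LPDN code): an affine layer maps mean and variance linearly, and the ReLU output follows a rectified Normal distribution whose mean and variance have closed forms:

```python
import math

def gauss_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def gauss_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def affine_moments(mu, var, W, b):
    """Propagate mean/variance through y = W x + b. Assuming independent
    inputs: E[y] = W mu + b and Var[y] = (W * W) var (elementwise square)."""
    out_mu = [sum(w * m for w, m in zip(row, mu)) + b_i
              for row, b_i in zip(W, b)]
    out_var = [sum(w * w * v for w, v in zip(row, var)) for row in W]
    return out_mu, out_var

def relu_moments(mu, var):
    """Closed-form mean/variance of max(0, X) for X ~ N(mu, var):
    the 'rectified' Normal distribution used in moment propagation."""
    out_mu, out_var = [], []
    for m, v in zip(mu, var):
        s = math.sqrt(max(v, 1e-12))
        a = m / s
        new_m = m * gauss_cdf(a) + s * gauss_pdf(a)         # E[max(0, X)]
        e2 = (m * m + v) * gauss_cdf(a) + m * s * gauss_pdf(a)  # E[max(0, X)^2]
        out_mu.append(new_m)
        out_var.append(max(e2 - new_m * new_m, 0.0))
    return out_mu, out_var
```

Chaining `affine_moments` and `relu_moments` layer by layer propagates the first two moments through a ReLU network in a single pass, instead of evaluating the network once per sampled subset.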

DASP vs other methods

Gradient-based methods:
✓ (Very) fast
✗ Poor Shapley Value estimation

Sampling-based methods:
✓ Unbiased Shapley Value estimator
✗ Slow

DASP

For details, come to the poster: Pacific Ballroom #63

Thank you

Lightweight Probabilistic Deep Network (Keras): github.com/marcoancona/LPDN

Deep Approximate Shapley Propagation: github.com/marcoancona/DASP

References

Shapley, Lloyd S., A Value for n-Person Games, 1953
Castro et al., Polynomial Calculation of the Shapley Value Based on Sampling, 2009
Fatima et al., A Linear Approximation Method for the Shapley Value, 2014
Ribeiro et al., "Why Should I Trust You?": Explaining the Predictions of Any Classifier, 2016
Sundararajan et al., Axiomatic Attribution for Deep Networks, 2017
Shrikumar et al., Learning Important Features Through Propagating Activation Differences, 2017
Lundberg et al., A Unified Approach to Interpreting Model Predictions, 2017
Gast et al., Lightweight Probabilistic Deep Networks, 2018