Interpretability in Machine Learning

Tameem Adel

Machine Learning Group, University of Cambridge, UK

February 22, 2018

Growth of ML

ML algorithms optimized:

Not only for task performance, e.g. accuracy.

But also other criteria, e.g. safety, fairness, providing the right to explanation.

There are often trade-offs among these goals.

However,

Accuracy can be quantified.

Not precisely the case for the other criteria.

What is interpretability?

Interpret means to explain or to present in understandable terms.

In the ML context: the ability to explain or to present in understandable terms to humans.

What constitutes an explanation? What makes some explanations better than others? How are explanations generated? When are explanations sought?

Automatic ways to generate and, to some extent, evaluate interpretability.

Taxonomy

Task-related:

Global interpretability: A general understanding of how the system is working as a whole, and of the patterns present in the data.

Local interpretability: Providing an explanation of a particular prediction or decision.

Method-related (what are the basic units of the explanation?):

Raw features.

Derived features that have some semantic meaning to the expert.

Prototypes.

The nature of the data/tasks should match the type of the explanation.

Visualizing Deep Neural Network Decisions: Prediction Difference Analysis

Zintgraf, Cohen, Adel, Welling, ICLR 2017

Idea

Visualize the response of a deep neural network to a specific input.

For an individual classifier prediction, assign each feature a relevance value reflecting its contribution towards or against the predicted class.

Visualizing deep networks

Looking under the hood: explaining why a decision was made.

Can help to understand strengths and limitations of a model, and help to improve it [e.g. wolves vs. huskies classified based on the presence/absence of snow].

Important for liability: why does the algorithm decide that this patient has Alzheimer's?

Can lead to new insights and theories in poorly understood domains.

Approach

Relevance of a feature x_i can be estimated by measuring how the prediction changes if the feature is unknown.

The difference between p(c|x) and p(c|x\i), where x\i denotes the set of all input features except x_i.

But how would a classifier recognize a feature as unknown?

Label the feature as unknown.

Retrain the classifier with the feature left out.

Marginalize the feature.

Marginalization of a feature

$$p(c \mid x_{\setminus i}) = \sum_{x_i} p(x_i \mid x_{\setminus i})\, p(c \mid x_{\setminus i}, x_i) \tag{1}$$

Assume $x_i$ is independent of $x_{\setminus i}$:

$$p(c \mid x_{\setminus i}) \approx \sum_{x_i} p(x_i)\, p(c \mid x_{\setminus i}, x_i) \tag{2}$$
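
As a minimal sketch of how Eq. (2) can be estimated in practice (not the authors' code): draw samples of feature i from its empirical marginal, e.g. its values across a training set, substitute them into the input, and average the classifier's class probabilities. The `predict_proba` interface and all names below are illustrative assumptions.

```python
import numpy as np

def prob_without_feature(classifier, x, i, marginal_samples):
    r"""Approximate p(c | x\i) as in Eq. (2): average the class probabilities
    over samples of feature i drawn from its marginal p(x_i), keeping every
    other feature of x fixed."""
    probs = []
    for xi in marginal_samples:        # e.g. values of feature i in the training set
        x_mod = x.copy()
        x_mod[i] = xi
        probs.append(classifier.predict_proba(x_mod[None, :])[0])
    return np.mean(probs, axis=0)      # estimate of p(c | x\i) for every class c
```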

Weight of evidence

Compare p(c|x\i) to p(c|x):

$$\mathrm{odds}(c \mid x) = \frac{p(c \mid x)}{1 - p(c \mid x)}$$

$$\mathrm{WE}_i(c \mid x) = \log_2\!\big(\mathrm{odds}(c \mid x)\big) - \log_2\!\big(\mathrm{odds}(c \mid x_{\setminus i})\big) \tag{3}$$

A large prediction difference → the feature contributed substantially to the classification.

A small prediction difference → the feature was not important for the decision.

A positive value WE_i → the feature has contributed evidence for the class of interest.

A negative value WE_i → the feature displays evidence against the class.
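
A small sketch of Eq. (3): compute the weight of evidence from the two predictions. The clipping constant `eps` is an added assumption to avoid taking the log of 0 or 1.

```python
import numpy as np

def weight_of_evidence(p_with, p_without, eps=1e-12):
    r"""Eq. (3): WE_i(c|x) = log2 odds(c|x) - log2 odds(c|x\i).
    p_with    -- p(c|x),   prediction with feature i present
    p_without -- p(c|x\i), prediction with feature i marginalized out."""
    def log_odds(p):
        p = np.clip(p, eps, 1.0 - eps)           # guard against log(0)
        return np.log2(p / (1.0 - p))
    return log_odds(p_with) - log_odds(p_without)

# WE_i > 0: feature i is evidence for class c; WE_i < 0: evidence against it.
```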

Conditional sampling

A pixel depends most strongly on a small neighbourhood around it.

The conditional of a pixel given its neighbourhood does not depend on the position of the pixel in the image.

$$p(x_i \mid x_{\setminus i}) \approx p(x_i \mid \hat{x}_{\setminus i}), \tag{4}$$

where $\hat{x}_{\setminus i}$ denotes a small patch of pixels surrounding $x_i$ (excluding $x_i$ itself).
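
One way to realize Eq. (4), sketched under the assumption that local pixel statistics are modelled with a multivariate Gaussian fitted to training patches; the centre pixel is then sampled from that Gaussian conditioned on its surrounding pixels. Function names and the patch layout are illustrative, not the authors' implementation.

```python
import numpy as np

def fit_patch_gaussian(train_patches):
    """Fit a multivariate Gaussian to flattened l*l training patches
    (shape [n_patches, l*l]) as a model of local pixel statistics."""
    mu = train_patches.mean(axis=0)
    cov = np.cov(train_patches, rowvar=False)
    cov += 1e-6 * np.eye(cov.shape[0])            # small ridge for stability
    return mu, cov

def sample_centre_pixel(patch, mu, cov, n_samples=10, rng=None):
    r"""Sample the centre pixel from p(x_i | \hat{x}_\i), Eq. (4):
    a Gaussian conditioned on the surrounding pixels of the patch."""
    rng = rng or np.random.default_rng()
    d = patch.size
    c = d // 2                                    # index of the centre pixel
    rest = [j for j in range(d) if j != c]
    x_rest = patch.ravel()[rest]
    # standard Gaussian conditioning formulas
    k = cov[c, rest] @ np.linalg.inv(cov[np.ix_(rest, rest)])
    mean_c = mu[c] + k @ (x_rest - mu[rest])
    var_c = cov[c, c] - k @ cov[rest, c]
    return rng.normal(mean_c, np.sqrt(max(var_c, 1e-12)), size=n_samples)
```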

Multivariate Analysis

A neural network is relatively robust to the marginalization of just one feature.

Remove several features at once:

Connected pixels.

Patches of size k × k.
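
Putting the pieces together, a sketch of the patch-based relevance map: slide a k × k window over the image, marginalize each window (here with a deliberately crude stand-in sampler rather than the conditional model sketched above), and accumulate the weight-of-evidence difference at every covered pixel. The `model.predict_proba` interface and all names are assumptions for illustration only.

```python
import numpy as np

def relevance_map(model, image, c, k=8, n_samples=10, rng=None):
    """Patch-based prediction difference analysis (sketch): returns a
    per-pixel weight-of-evidence map for class c on a 2-D greyscale image."""
    rng = rng or np.random.default_rng()
    H, W = image.shape
    log_odds = lambda p: np.log2(np.clip(p, 1e-12, 1 - 1e-12)
                                 / np.clip(1 - p, 1e-12, 1))
    p_full = model.predict_proba(image[None])[0, c]      # p(c | x)
    we = np.zeros((H, W))
    counts = np.zeros((H, W))
    for y in range(H - k + 1):
        for x in range(W - k + 1):
            probs = []
            for _ in range(n_samples):
                imputed = image.copy()
                # crude imputation: fill the window with pixels drawn from the
                # image's own marginal (the conditional sampler is preferable)
                imputed[y:y+k, x:x+k] = rng.choice(image.ravel(), size=(k, k))
                probs.append(model.predict_proba(imputed[None])[0, c])
            p_without = np.mean(probs)                   # ~ p(c | x\window)
            we[y:y+k, x:x+k] += log_odds(p_full) - log_odds(p_without)
            counts[y:y+k, x:x+k] += 1
    return we / np.maximum(counts, 1)     # average over overlapping windows
```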

Experiments

Conditional sampling

Red: evidence for the class.

Blue: evidence against the class.

Experiments

Multivariate analysis

MRI data

Conclusions

A method for visualizing deep neural networks by using a more powerful conditional, multivariate model.

The visualization method shows which pixels of a specific input image are evidence for or against a node in the network.

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets

Chen, Duan, Houthooft, Schulman, Sutskever, Abbeel, NIPS 2016
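
For context, the central objective from the cited paper (summarized here, not taken from these slides): the standard GAN game $V(D, G)$ is augmented with a variational lower bound $L_I(G, Q)$ on the mutual information between a subset of latent codes $c$ and the generated sample $G(z, c)$, so that the codes are pushed to capture interpretable factors of variation:

$$\min_{G, Q}\; \max_{D}\; V_{\text{InfoGAN}}(D, G, Q) = V(D, G) - \lambda\, L_I(G, Q)$$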

The End
