
2018 Predictive Analytics Symposium

Session 29: Opening the Black Box: Understanding Complex Models

SOA Antitrust Compliance Guidelines
SOA Presentation Disclaimer

Opening the Black Box: Understanding Complex Models
Session 29
September 2018 – Predictive Analytics Symposium

Michael Niemerg, FSA, MAAA
Predictive Modeling Manager, Milliman IntelliScript
Michael.Niemerg@Milliman.com

SOA Antitrust Compliance Guidelines

Active participation in the Society of Actuaries is an important aspect of membership. While the positive contributions of professional societies and associations are well-recognized and encouraged, association activities are vulnerable to close antitrust scrutiny. By their very nature, associations bring together industry competitors and other market participants.

The United States antitrust laws aim to protect consumers by preserving the free economy and prohibiting anti-competitive business practices; they promote competition. There are both state and federal antitrust laws, although state antitrust laws closely follow federal law. The Sherman Act is the primary U.S. antitrust law pertaining to association activities. The Sherman Act prohibits every contract, combination or conspiracy that places an unreasonable restraint on trade. There are, however, some activities that are illegal under all circumstances, such as price fixing, market allocation and collusive bidding.

There is no safe harbor under the antitrust law for professional association activities. Therefore, association meeting participants should refrain from discussing any activity that could potentially be construed as having an anti-competitive effect. Discussions relating to product or service pricing, market allocations, membership restrictions, product standardization or other conditions on trade could arguably be perceived as a restraint on trade and may expose the SOA and its members to antitrust enforcement procedures.

While participating in all SOA in person meetings, webinars, teleconferences or side discussions, you should avoid discussing competitively sensitive information with competitors and follow these guidelines:

• Do not discuss prices for services or products or anything else that might affect prices.
• Do not discuss what you or other entities plan to do in a particular geographic or product market or with particular customers.
• Do not speak on behalf of the SOA or any of its committees unless specifically authorized to do so.
• Do leave a meeting where any anticompetitive pricing or market allocation discussion occurs.
• Do alert SOA staff and/or legal counsel to any concerning discussions.
• Do consult with legal counsel before raising any matter or making a statement that may involve competitively sensitive information.

Adherence to these guidelines involves not only avoidance of antitrust violations, but avoidance of behavior which might be so construed. These guidelines only provide an overview of prohibited activities. SOA legal counsel reviews meeting agendas and materials as deemed appropriate, and any discussion that departs from the formal agenda should be scrutinized carefully. Antitrust compliance is everyone’s responsibility; however, please seek legal counsel if you have any questions or concerns.

2

Limitations

3

This presentation is intended for informational purposes only. It reflects the opinions of the presenter, and does not represent any formal views held by Milliman, Inc. Milliman makes no representations or warranties regarding the contents of this presentation. Milliman does not intend to benefit or create a legal duty to any recipient of this presentation.

What is interpretability?

• How does the algorithm construct the model?
• What features most influence the model’s predictions, and by how much?
• Does the relationship between each predictor and the response make sense?
• How does the model extrapolate and interpolate?
• Why did the model make a specific prediction?
• How confident / sensitive is the model?
• Is the model equitable? Is it discriminatory?

4

Warning: Contents hard to interpret

What makes interpretability difficult?

• Algorithmic complexity
• High dimensionality
• Interactions and correlation
• Nonlinear relationships
• Omitted variables
• Noise variables

5

Interpretability Issues

• A priori vs. post hoc
  • Choosing an interpretable model form vs. using a “black box”
• Global vs. local
  • Does the interpretation explain something about the entire model (global) or only a particular instance (local)?
• Model-specific vs. model-agnostic
  • Is the interpretation method only applicable to certain algorithms, or can it be applied to any algorithm?
• Interpretability vs. performance
  • Simpler models are often more interpretable at the expense of model performance

6

Interpretability vs. Performance

7

Some of the worst culprits…

• Random Forests
• Gradient Boosted Decision Trees
• Neural Networks
• Ensembles

8

Methods

Model Agnostic
• Partial Dependence Plots
• ICE Plots
• Variable Importance
• Local Interpretable Model-agnostic Explanations (LIME)
• Visualization: t-SNE / MDS / PCA
• Surrogate Models
• Sensitivity Analysis
• Shapley Predictions
• Rule Extraction

Neural Networks
• Saliency Masks
• Activation Maximization
• Relevance Propagation

Gradient Boosting
• XGBFI
• xgboostExplainer
• Monotonicity Constraints

9

Software

• iml (R)
• LIME (R / Python)
• Skater (Python)
• XGBFI (R - xgboost)
• xgboostExplainer (R - xgboost)
• DALEX (R)
• H2O Driverless AI
• Aequitas
• Themis ML (Python)

10

Creating more interpretable models


• Simpler methods
  • Decision trees
  • Linear models
• Monotonicity constraints
• Higher regularization
• Fewer parameters
  • Shallower tree models

Let’s meet our data: dataCar

• Library: insuranceData
• Target:
  • claimcst0: claim amount
• Features:
  • veh_value: vehicle value, in $10,000s
  • veh_body: vehicle body
  • veh_age: age of vehicle (1, 2, 3, 4)
  • gender: gender of driver (M, F)
  • area: driver's area of residence (A, B, C, D, E, F)
  • agecat: driver's age category (1, 2, 3, 4, 5, 6)

12

http://www.businessandeconomics.mq.edu.au/our_departments/Applied_Finance_and_Actuarial_Studies/research/books/GLMsforInsuranceData/data_sets

PDP and ICE Plots

13


Partial Dependence Plot (PDP)
• Displays the marginal impact of a feature on the model: what is happening “all else equal”
• Shows the relationship between the target and the feature, on average
• Construction: fix 1 or 2 predictors at multiple values of interest, average over the other variables, and plot the response (see the sketch below)

Individual Conditional Expectation (ICE)
• Shows how a single prediction changes when the value of a single feature is varied
• Run this for multiple observations and plot the results
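The construction above can be sketched with scikit-learn's built-in PDP/ICE tooling. This is a minimal illustration, not the presenter's code: the synthetic data, the gradient-boosting model, and the chosen feature index stand in for the dataCar setup.

```python
# Minimal PDP + ICE sketch with scikit-learn; data, model, and feature index are stand-ins.
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay

X, y = make_regression(n_samples=1000, n_features=6, noise=10.0, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# kind="both" overlays the averaged PDP curve on the individual ICE curves for feature 0
PartialDependenceDisplay.from_estimator(model, X, features=[0], kind="both")
plt.show()
```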

PDP and ICE Plots - Visualized

14


[Figure: PDP and ICE curves of claim amount against veh_value, one panel for the XGBoost model and one for the neural network]

PDP with 2 features - Visualized

15

[Figure: two-feature partial dependence surface of claim amount over veh_age and veh_value]

Surrogate Model

16


• A model trained using another model’s predictions as its target
  • Decision tree
  • Linear model
• The result is a simpler model that can help interpret the more complex model
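A minimal sketch of the idea, assuming a random forest as the black box and a depth-3 decision tree as the surrogate (both stand-ins, not the presenter's models):

```python
# Global surrogate sketch: fit a shallow tree to a black-box model's predictions.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.tree import DecisionTreeRegressor, export_text

X, y = make_regression(n_samples=2000, n_features=6, noise=5.0, random_state=0)
black_box = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Train the surrogate on the black box's predictions, not on the true target
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Fidelity: how closely does the surrogate mimic the black box?
print("fidelity R^2:", r2_score(black_box.predict(X), surrogate.predict(X)))
print(export_text(surrogate))
```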

Surrogate Model - Visualized

17


Feature Importance

18


• Measures how much a feature contributes to the predictive performance of the model
• Helps us know what drives predictions at a global level
• Common methods (one is sketched below):
  • Permute a feature and measure the change in model error
  • LOCO (Leave One Covariate Out): build the model with and without the feature and compare the difference in error
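A minimal sketch of the permutation approach above, using scikit-learn's permutation_importance; the data and model are stand-ins:

```python
# Permutation importance sketch: shuffle one feature at a time and measure the drop in score.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, n_features=6, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# Each feature is permuted n_repeats times on held-out data
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for i in result.importances_mean.argsort()[::-1]:
    print(f"feature {i}: {result.importances_mean[i]:.3f} +/- {result.importances_std[i]:.3f}")
```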

Feature Importance - Visualized

19


[Figure: feature importance bar chart; axis labels: Feature, Feature Importance]

Shapley Predictions

20


• Provides a measure of local feature contribution for a given prediction
• Has its basis in game theory:
  • Assigns a “payout” to players in proportion to their marginal contributions
  • The “game” is the prediction of an observation; the “players” are the features
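A minimal sketch with the shap package, a common Python option (it is not on the software slide); the data and the XGBoost model are stand-ins:

```python
# Shapley value sketch: per-feature contributions to a single prediction.
import shap
import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=1000, n_features=6, noise=5.0, random_state=0)
model = xgb.XGBRegressor(n_estimators=200, max_depth=3).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Contributions of each feature to the first prediction, relative to the average prediction
print("base value:", explainer.expected_value)
print("per-feature contributions:", shap_values[0])
```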

Shapley Visualization

21


[Figure: Shapley value visualization; axis labels: Feature Value, Feature Importance]

Local Surrogate Models (LIME)

22


Algorithm (sketched below)
• Choose an instance to explain
• Permute the instance to create replicated feature data
• Weight the permuted instances by their proximity to the original
• Apply the “black-box” machine learning model to predict outcomes for the permuted data
• Fit a simple model that explains the complex model’s outcome, using the selected features from the permuted data, weighted by similarity to the original observation
• Explain the prediction using this simpler model
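A minimal sketch of the algorithm above with the LIME package; the data, model, and feature names are illustrative stand-ins:

```python
# LIME sketch: explain one prediction with a locally weighted linear model.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=2000, n_features=6, noise=5.0, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

feature_names = [f"x{i}" for i in range(X.shape[1])]
explainer = LimeTabularExplainer(X, feature_names=feature_names, mode="regression")

# The explainer permutes the instance, weights samples by proximity, and fits a simple model
explanation = explainer.explain_instance(X[0], model.predict, num_features=4)
print(explanation.as_list())
```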

LIME - Visualized

23


Sensitivity Analysis

24


• Thoroughly test how the model’s predictions change under small perturbations of the features
• Use simulated data representing prototypes for different areas of interest
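A minimal one-at-a-time sketch, assuming a fitted model and a single prototype observation (both stand-ins):

```python
# Sensitivity sketch: nudge each feature slightly and watch how the prediction moves.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=1000, n_features=6, noise=5.0, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

x0 = X[:1].copy()                         # prototype observation of interest
base = model.predict(x0)[0]

for j in range(X.shape[1]):
    x_pert = x0.copy()
    x_pert[0, j] += 0.1 * X[:, j].std()   # small perturbation of feature j
    print(f"feature {j}: prediction change {model.predict(x_pert)[0] - base:+.3f}")
```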

XGBFI (XGBoost)

25


• Computes variable importance and interaction importance (“Gain”)
• Shows the number of splits taken on a feature (“FScore”) and the cut points chosen
• & more!

Feature      Gain           FScore
veh_value    4,259,983,149   1,911
area         1,211,945,038     878
veh_body     1,147,646,618     914
veh_age      1,088,228,059     709
agecat         806,955,407     610
gender         707,919,139     514

Interaction           Gain           FScore
veh_value|veh_value   5,970,120,855   1,198
veh_age|veh_value     1,562,875,549     252
agecat|veh_value      1,311,331,233     299
veh_body|veh_value    1,295,426,670     313
area|veh_value        1,100,576,093     327
gender|veh_value        880,025,508     245
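XGBFI itself works from the dumped model files and also reports interaction-level gain. As a rough stand-in, xgboost's own Booster.get_score reports per-feature gain and split counts (importance_type "weight" is the FScore analogue); the data and parameters below are illustrative:

```python
# Gain and split-count ("FScore"-style) importances straight from an xgboost booster.
import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=2000, n_features=6, noise=5.0, random_state=0)
dtrain = xgb.DMatrix(X, label=y, feature_names=[f"x{i}" for i in range(X.shape[1])])

booster = xgb.train({"max_depth": 3, "eta": 0.1}, dtrain, num_boost_round=200)

print("gain:  ", booster.get_score(importance_type="gain"))
print("fscore:", booster.get_score(importance_type="weight"))
```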

XGBFI (XGBoost)

26


veh_value        area             veh_body         veh_age          agecat           gender
split    count   split    count   split    count   split    count   split    count   split    count
0.09     25      1.5      226     1.5      1       1.5      212     1.5      129     1.5      514
0.185    1       2.5      198     2.5      7       2.5      222     2.5      134
0.205    1       3.5      178     3.5      58      3.5      275     3.5      121
0.225    1       4.5      161     4.5      121                      4.5      116
0.23     4       5.5      115     5        1                        5.5      110
0.245    2                        5.5      47
0.25     2                        6        15
0.265    1                        6.5      29
0.285    6                        7        8
0.295    5                        7.5      64
0.305    6                        8        2
0.315    1                        8.5      18
0.325    14                       9        70
0.33     3                        9.5      16
0.345    14                       10.5     190
0.355    9                        11.5     132
0.36     1                        12.5     135

xgboostExplainer (XGBoost)

27


• Shows how each variable is locally contributing to a prediction

Monotonicity Constraints (XGBoost)

28

http://xgboost.readthedocs.io/en/latest/tutorials/monotonic.html

• Enforce a constraint on the model so that the predicted response can only increase (or only decrease) as a given feature increases
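A minimal sketch of a monotone-increasing constraint on the first feature in xgboost; the data and parameter values are illustrative, and the constraint vector is positional:

```python
# Monotonicity sketch: force the fitted response to be non-decreasing in feature 0.
import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=2000, n_features=6, noise=5.0, random_state=0)
dtrain = xgb.DMatrix(X, label=y)

params = {
    "max_depth": 3,
    "eta": 0.1,
    "monotone_constraints": "(1,0,0,0,0,0)",  # +1 = increasing, -1 = decreasing, 0 = unconstrained
}
booster = xgb.train(params, dtrain, num_boost_round=200)
```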

Activation Maximization (Neural Networks)

29

https://www.researchgate.net/figure/Figure-S4-a-Previous-state-of-the-art-activation-maximization-algorithms-produce_fig9_301845946

• Find a prototype input that most strongly activates a given prediction
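A minimal sketch of the idea in PyTorch: gradient ascent on the input of a toy network to find an input that maximally activates the output unit (the network and starting point are stand-ins):

```python
# Activation maximization sketch: optimize the input, not the weights.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(6, 16), nn.ReLU(), nn.Linear(16, 1))

x = torch.zeros(1, 6, requires_grad=True)   # start from a neutral input
optimizer = torch.optim.Adam([x], lr=0.1)

for _ in range(200):
    optimizer.zero_grad()
    loss = -net(x).sum()                    # maximize the output unit's activation
    loss.backward()
    optimizer.step()

print("prototype input:", x.detach())
```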

Saliency Masks (Neural Networks)

30

• Determine which parts of the input are driving the prediction
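A minimal gradient-saliency sketch in PyTorch; the toy network and input are stand-ins:

```python
# Saliency sketch: gradient of the prediction with respect to the input.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(6, 16), nn.ReLU(), nn.Linear(16, 1))

x = torch.randn(1, 6, requires_grad=True)
net(x).sum().backward()

saliency = x.grad.abs()   # large values mark the inputs the prediction is most sensitive to
print(saliency)
```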

Adversarial Examples

31

https://codewords.recurse.com/issues/five/why-do-neural-networks-think-a-panda-is-a-vulture

Model Fairness

32

Does the model discriminate against any protected classes?
• Does the model directly incorporate protected classes?
• Can the model proxy protected classes through other variables in the model?
  • Determine whether the protected classes can be statistically predicted using other data attributes (e.g., logistic regression; sketched below)
• Determine whether model outcomes are statistically different by protected class
• Change either the data or the predictions to increase model fairness
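A minimal sketch of the proxy check above: can a protected attribute be predicted from the other features? The synthetic data and the logistic-regression choice are illustrative:

```python
# Proxy check sketch: high AUC means other features can stand in for the protected class.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X_other = rng.normal(size=(2000, 5))                                   # non-protected features
protected = (X_other[:, 0] + rng.normal(size=2000) > 0).astype(int)    # correlated protected class

clf = LogisticRegression(max_iter=1000)
auc = cross_val_score(clf, X_other, protected, cv=5, scoring="roc_auc").mean()
print("AUC for predicting the protected class from other features:", round(auc, 3))
```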

Themis ML

33

“Themis ML is a Python library built on top of pandas and sklearn that implements fairness-aware machine learning algorithms” (https://github.com/cosmicBboy/themis-ml)

Aequitas

34

“An open source bias audit toolkit for machine learning developers, analysts, and policymakers to audit machine learning models for discrimination and bias, and make informed and equitable decisions around developing and deploying predictive risk-assessment tools” (https://dsapp.uchicago.edu/aequitas/)

Explainable Machine Learning Challenge

35

• A Kaggle-like competition for model interpretability
• A collaboration between Google, FICO, and academics at Berkeley, Oxford, Imperial, UC Irvine, and MIT
• Task: use information from a Home Equity Line of Credit (HELOC) dataset to predict whether someone will repay their HELOC account within 2 years. This prediction is then used to decide whether the homeowner qualifies for a line of credit and, if so, how much credit should be extended.
• https://community.fico.com/s/explainable-machine-learning-challenge

References

36

Interpretable Machine Learning: A Guide to Making Black Box Models Explainable. https://christophm.github.io/interpretable-ml-book/

XGBoost documentation. http://xgboost.readthedocs.io/en/latest/

G. Montavon, W. Samek, and K.-R. Müller. Methods for interpreting and understanding deep neural networks. arXiv preprint arXiv:1706.07979, 2017.

Z. C. Lipton. The mythos of model interpretability. arXiv preprint arXiv:1606.03490, 2016.

F. Doshi-Velez and B. Kim. Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608, 2017.

A. Nguyen, J. Yosinski, and J. Clune. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 427–436, 2015.

Thank you