
2018 Predictive Analytics Symposium

Session 29: Opening the Black Box: Understanding Complex Models

SOA Antitrust Compliance Guidelines
SOA Presentation Disclaimer

Opening the Black Box: Understanding Complex Models
Session 29
September 2018 – Predictive Analytics Symposium

Michael Niemerg, FSA, MAAA
Predictive Modeling Manager, Milliman IntelliScript
Michael.Niemerg@Milliman.com

SOA Antitrust Compliance Guidelines

Active participation in the Society of Actuaries is an important aspect of membership. While the positive contributions of professional societies and associations are well-recognized and encouraged, association activities are vulnerable to close antitrust scrutiny. By their very nature, associations bring together industry competitors and other market participants.

The United States antitrust laws aim to protect consumers by preserving the free economy and prohibiting anti-competitive business practices; they promote competition. There are both state and federal antitrust laws, although state antitrust laws closely follow federal law. The Sherman Act is the primary U.S. antitrust law pertaining to association activities. The Sherman Act prohibits every contract, combination or conspiracy that places an unreasonable restraint on trade. There are, however, some activities that are illegal under all circumstances, such as price fixing, market allocation and collusive bidding.

There is no safe harbor under the antitrust law for professional association activities. Therefore, association meeting participants should refrain from discussing any activity that could potentially be construed as having an anti-competitive effect. Discussions relating to product or service pricing, market allocations, membership restrictions, product standardization or other conditions on trade could arguably be perceived as a restraint on trade and may expose the SOA and its members to antitrust enforcement procedures.

While participating in all SOA in person meetings, webinars, teleconferences or side discussions, you should avoid discussing competitively sensitive information with competitors and follow these guidelines:

• Do not discuss prices for services or products or anything else that might affect prices.
• Do not discuss what you or other entities plan to do in a particular geographic or product market or with particular customers.
• Do not speak on behalf of the SOA or any of its committees unless specifically authorized to do so.
• Do leave a meeting where any anticompetitive pricing or market allocation discussion occurs.
• Do alert SOA staff and/or legal counsel to any concerning discussions.
• Do consult with legal counsel before raising any matter or making a statement that may involve competitively sensitive information.

Adherence to these guidelines involves not only avoidance of antitrust violations, but avoidance of behavior which might be so construed. These guidelines only provide an overview of prohibited activities. SOA legal counsel reviews meeting agendas and materials as deemed appropriate, and any discussion that departs from the formal agenda should be scrutinized carefully. Antitrust compliance is everyone’s responsibility; however, please seek legal counsel if you have any questions or concerns.

2

Limitations

3

This presentation is intended for informational purposes only. It reflects the opinions of the presenter, and does not represent any formal views held by Milliman, Inc. Milliman makes no representations or warranties regarding the contents of this presentation. Milliman does not intend to benefit or create a legal duty to any recipient of this presentation.

What is interpretability?

• How does the algorithm construct the model?
• What features most influence the model’s predictions, and by how much?
• Does the relationship between each predictor and the response make sense?
• How does the model extrapolate and interpolate?
• Why did the model make a specific prediction?
• How confident / sensitive is the model?
• Is the model equitable? Is it discriminatory?

4

Warning: Contents hard to interpret

What makes interpretability difficult?

• Algorithmic complexity
• High dimensionality
• Interactions and correlation
• Nonlinear relationships
• Omitted variables
• Noise variables

5

Interpretability Issues

• A priori vs. post hoc
  • Choosing an interpretable model form vs. using a “black box”
• Global vs. local
  • Does the interpretation explain something about the entire model (global) or only a particular instance (local)?
• Model-specific vs. model-agnostic
  • Is the interpretation method only applicable to certain algorithms, or can it be applied to any algorithm?
• Interpretability vs. performance
  • Simpler models are often more interpretable at the expense of model performance

6

Interpretability vs. Performance

7

Some of the worst culprits…

• Random Forests
• Gradient Boosted Decision Trees
• Neural Networks
• Ensembles

8

Methods

Model Agnostic
• Partial Dependence Plots
• ICE Plots
• Variable Importance
• Local Interpretable Model-agnostic Explanations (LIME)
• Visualization: t-SNE / MDS / PCA
• Surrogate Models
• Sensitivity Analysis
• Shapley Predictions
• Rule Extraction

Neural Networks
• Saliency Masks
• Activation Maximization
• Relevance Propagation

Gradient Boosting
• XGBFI
• xgboostExplainer
• Monotonicity Constraints

9

Software

• iml (R)
• LIME (R / Python)
• Skater (Python)
• XGBFI (R - xgboost)
• xgboostExplainer (R - xgboost)
• DALEX (R)
• H2O Driverless AI
• Aequitas
• Themis ML (Python)

10

Creating more interpretable models


• Simpler methods
  • Decision trees
  • Linear models
• Monotonicity constraints
• Higher regularization
• Fewer parameters
  • Shallower tree models

Let’s meet our data: dataCar

• Library: insuranceData
• Target:
  • claimcst0: claim amount
• Features:
  • veh_value: vehicle value, in $10,000s
  • veh_body: vehicle body
  • veh_age: age of vehicle (1, 2, 3, 4)
  • gender: gender of driver (M, F)
  • area: driver's area of residence (A, B, C, D, E, F)
  • agecat: driver's age category (1, 2, 3, 4, 5, 6)

12

http://www.businessandeconomics.mq.edu.au/our_departments/Applied_Finance_and_Actuarial_Studies/research/books/GLMsforInsuranceData/data_sets

PDP and ICE Plots

13


Partial Dependence Plot (PDP)
• Displays the marginal impact of a feature on the model: what is happening “all else equal”
• Shows the relationship between the target and the feature, on average
• Construction: fix 1 or 2 predictors at multiple values of interest, average over the other variables, and plot the response (see the sketch below)

Individual Conditional Expectation (ICE)
• Shows how a single prediction changes when the value of a single feature is varied
• Run this for multiple observations and plot the results
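The construction above can be sketched with scikit-learn's built-in PDP/ICE tooling. This is a minimal illustration, not the presenter's code: the synthetic data, the gradient-boosting model, and the chosen feature index stand in for the dataCar setup.

```python
# Minimal PDP + ICE sketch with scikit-learn; data, model, and feature index are stand-ins.
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay

X, y = make_regression(n_samples=1000, n_features=6, noise=10.0, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# kind="both" overlays the averaged PDP curve on the individual ICE curves for feature 0
PartialDependenceDisplay.from_estimator(model, X, features=[0], kind="both")
plt.show()
```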

PDP and ICE Plots - Visualized

14


[Figure: PDP and ICE curves of claim amount against veh_value, one panel for the XGBoost model and one for the neural network]

PDP with 2 features - Visualized

15

[Figure: two-feature partial dependence surface of claim amount over veh_age and veh_value]

Surrogate Model

16


• A model trained using another model’s predictions as its target
  • Decision tree
  • Linear model
• The result is a simpler model that can help interpret the more complex model
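A minimal sketch of the idea, assuming a random forest as the black box and a depth-3 decision tree as the surrogate (both stand-ins, not the presenter's models):

```python
# Global surrogate sketch: fit a shallow tree to a black-box model's predictions.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.tree import DecisionTreeRegressor, export_text

X, y = make_regression(n_samples=2000, n_features=6, noise=5.0, random_state=0)
black_box = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Train the surrogate on the black box's predictions, not on the true target
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Fidelity: how closely does the surrogate mimic the black box?
print("fidelity R^2:", r2_score(black_box.predict(X), surrogate.predict(X)))
print(export_text(surrogate))
```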

Surrogate Model - Visualized

17


Feature Importance

18


• Measures how much a feature contributes to the predictive performance of the model
• Helps us know what drives predictions at a global level
• Common methods (one is sketched below):
  • Permute a feature and measure the change in model error
  • LOCO (Leave One Covariate Out): build the model with and without the feature and compare the difference in error
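A minimal sketch of the permutation approach above, using scikit-learn's permutation_importance; the data and model are stand-ins:

```python
# Permutation importance sketch: shuffle one feature at a time and measure the drop in score.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, n_features=6, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# Each feature is permuted n_repeats times on held-out data
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for i in result.importances_mean.argsort()[::-1]:
    print(f"feature {i}: {result.importances_mean[i]:.3f} +/- {result.importances_std[i]:.3f}")
```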

Feature Importance - Visualized

19


[Figure: feature importance bar chart; axis labels: Feature, Feature Importance]

Shapley Predictions

20


• Provides a measure of local feature contribution for a given prediction
• Has its basis in game theory:
  • Assigns a “payout” to players in proportion to their marginal contributions
  • The “game” is the prediction of an observation; the “players” are the features
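A minimal sketch with the shap package, a common Python option (it is not on the software slide); the data and the XGBoost model are stand-ins:

```python
# Shapley value sketch: per-feature contributions to a single prediction.
import shap
import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=1000, n_features=6, noise=5.0, random_state=0)
model = xgb.XGBRegressor(n_estimators=200, max_depth=3).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Contributions of each feature to the first prediction, relative to the average prediction
print("base value:", explainer.expected_value)
print("per-feature contributions:", shap_values[0])
```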

Shapley Visualization

21


[Figure: Shapley value visualization; axis labels: Feature Value, Feature Importance]

Local Surrogate Models (LIME)

22


Algorithm (sketched below)
• Choose an instance to explain
• Permute the instance to create replicated feature data
• Weight the permuted instances by their proximity to the original
• Apply the “black-box” machine learning model to predict outcomes for the permuted data
• Fit a simple model that explains the complex model’s outcome, using the selected features from the permuted data, weighted by similarity to the original observation
• Explain the prediction using this simpler model
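A minimal sketch of the algorithm above with the LIME package; the data, model, and feature names are illustrative stand-ins:

```python
# LIME sketch: explain one prediction with a locally weighted linear model.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=2000, n_features=6, noise=5.0, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

feature_names = [f"x{i}" for i in range(X.shape[1])]
explainer = LimeTabularExplainer(X, feature_names=feature_names, mode="regression")

# The explainer permutes the instance, weights samples by proximity, and fits a simple model
explanation = explainer.explain_instance(X[0], model.predict, num_features=4)
print(explanation.as_list())
```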

LIME - Visualized

23


Sensitivity Analysis

24


• Thoroughly test how the model’s predictions change under small perturbations of the features
• Use simulated data representing prototypes for different areas of interest
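A minimal one-at-a-time sketch, assuming a fitted model and a single prototype observation (both stand-ins):

```python
# Sensitivity sketch: nudge each feature slightly and watch how the prediction moves.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=1000, n_features=6, noise=5.0, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

x0 = X[:1].copy()                         # prototype observation of interest
base = model.predict(x0)[0]

for j in range(X.shape[1]):
    x_pert = x0.copy()
    x_pert[0, j] += 0.1 * X[:, j].std()   # small perturbation of feature j
    print(f"feature {j}: prediction change {model.predict(x_pert)[0] - base:+.3f}")
```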

XGBFI (XGBoost)

25


• Computes variable importance and interaction importance (“Gain”)
• Shows the number of splits taken on a feature (“FScore”) and the cut points chosen
• & more!

Feature      Gain           FScore
veh_value    4,259,983,149   1,911
area         1,211,945,038     878
veh_body     1,147,646,618     914
veh_age      1,088,228,059     709
agecat         806,955,407     610
gender         707,919,139     514

Interaction           Gain           FScore
veh_value|veh_value   5,970,120,855   1,198
veh_age|veh_value     1,562,875,549     252
agecat|veh_value      1,311,331,233     299
veh_body|veh_value    1,295,426,670     313
area|veh_value        1,100,576,093     327
gender|veh_value        880,025,508     245
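XGBFI itself works from the dumped model files and also reports interaction-level gain. As a rough stand-in, xgboost's own Booster.get_score reports per-feature gain and split counts (importance_type "weight" is the FScore analogue); the data and parameters below are illustrative:

```python
# Gain and split-count ("FScore"-style) importances straight from an xgboost booster.
import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=2000, n_features=6, noise=5.0, random_state=0)
dtrain = xgb.DMatrix(X, label=y, feature_names=[f"x{i}" for i in range(X.shape[1])])

booster = xgb.train({"max_depth": 3, "eta": 0.1}, dtrain, num_boost_round=200)

print("gain:  ", booster.get_score(importance_type="gain"))
print("fscore:", booster.get_score(importance_type="weight"))
```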

XGBFI (XGBoost)

26


veh_value        area             veh_body         veh_age          agecat           gender
split    count   split    count   split    count   split    count   split    count   split    count
0.09     25      1.5      226     1.5      1       1.5      212     1.5      129     1.5      514
0.185    1       2.5      198     2.5      7       2.5      222     2.5      134
0.205    1       3.5      178     3.5      58      3.5      275     3.5      121
0.225    1       4.5      161     4.5      121                      4.5      116
0.23     4       5.5      115     5        1                        5.5      110
0.245    2                        5.5      47
0.25     2                        6        15
0.265    1                        6.5      29
0.285    6                        7        8
0.295    5                        7.5      64
0.305    6                        8        2
0.315    1                        8.5      18
0.325    14                       9        70
0.33     3                        9.5      16
0.345    14                       10.5     190
0.355    9                        11.5     132
0.36     1                        12.5     135

xgboostExplainer (XGBoost)

27


• Shows how each variable is locally contributing to a prediction

Monotonicity Constraints (XGBoost)

28

http://xgboost.readthedocs.io/en/latest/tutorials/monotonic.html

• Enforce a constraint on the model so that the predicted response can only increase (or only decrease) as a given feature increases
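A minimal sketch of a monotone-increasing constraint on the first feature in xgboost; the data and parameter values are illustrative, and the constraint vector is positional:

```python
# Monotonicity sketch: force the fitted response to be non-decreasing in feature 0.
import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=2000, n_features=6, noise=5.0, random_state=0)
dtrain = xgb.DMatrix(X, label=y)

params = {
    "max_depth": 3,
    "eta": 0.1,
    "monotone_constraints": "(1,0,0,0,0,0)",  # +1 = increasing, -1 = decreasing, 0 = unconstrained
}
booster = xgb.train(params, dtrain, num_boost_round=200)
```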

Activation Maximization (Neural Networks)

29

https://www.researchgate.net/figure/Figure-S4-a-Previous-state-of-the-art-activation-maximization-algorithms-produce_fig9_301845946

• Find a prototype input that most strongly activates a given prediction
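A minimal sketch of the idea in PyTorch: gradient ascent on the input of a toy network to find an input that maximally activates the output unit (the network and starting point are stand-ins):

```python
# Activation maximization sketch: optimize the input, not the weights.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(6, 16), nn.ReLU(), nn.Linear(16, 1))

x = torch.zeros(1, 6, requires_grad=True)   # start from a neutral input
optimizer = torch.optim.Adam([x], lr=0.1)

for _ in range(200):
    optimizer.zero_grad()
    loss = -net(x).sum()                    # maximize the output unit's activation
    loss.backward()
    optimizer.step()

print("prototype input:", x.detach())
```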

Saliency Masks (Neural Networks)

30

• Determine which parts of the input are driving the prediction
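A minimal gradient-saliency sketch in PyTorch; the toy network and input are stand-ins:

```python
# Saliency sketch: gradient of the prediction with respect to the input.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(6, 16), nn.ReLU(), nn.Linear(16, 1))

x = torch.randn(1, 6, requires_grad=True)
net(x).sum().backward()

saliency = x.grad.abs()   # large values mark the inputs the prediction is most sensitive to
print(saliency)
```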

Adversarial Examples

31

https://codewords.recurse.com/issues/five/why-do-neural-networks-think-a-panda-is-a-vulture

Model Fairness

32

Does the model discriminate against any protected classes?
• Does the model directly incorporate protected classes?
• Can the model proxy protected classes through other variables in the model?
  • Determine whether the protected classes can be statistically predicted using other data attributes (e.g., logistic regression; sketched below)
• Determine whether model outcomes are statistically different by protected class
• Change either the data or the predictions to increase model fairness
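A minimal sketch of the proxy check above: can a protected attribute be predicted from the other features? The synthetic data and the logistic-regression choice are illustrative:

```python
# Proxy check sketch: high AUC means other features can stand in for the protected class.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X_other = rng.normal(size=(2000, 5))                                   # non-protected features
protected = (X_other[:, 0] + rng.normal(size=2000) > 0).astype(int)    # correlated protected class

clf = LogisticRegression(max_iter=1000)
auc = cross_val_score(clf, X_other, protected, cv=5, scoring="roc_auc").mean()
print("AUC for predicting the protected class from other features:", round(auc, 3))
```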

Themis ML

33

“Themis ML is a Python library built on top of pandas and sklearn that implements fairness-aware machine learning algorithms” (https://github.com/cosmicBboy/themis-ml)

Aequitas

34

“An open source bias audit toolkit for machine learning developers, analysts, and policymakers to audit machine learning models for discrimination and bias, and make informed and equitable decisions around developing and deploying predictive risk-assessment tools” (https://dsapp.uchicago.edu/aequitas/)

Explainable Machine Learning Challenge

35

• A Kaggle-like competition for model interpretability
• A collaboration between Google, FICO, and academics at Berkeley, Oxford, Imperial, UC Irvine, and MIT
• Task: use information from a Home Equity Line of Credit (HELOC) dataset to predict whether someone will repay their HELOC account within 2 years. This prediction is then used to decide whether the homeowner qualifies for a line of credit and, if so, how much credit should be extended.
• https://community.fico.com/s/explainable-machine-learning-challenge

References

36

Interpretable Machine Learning: A Guide to Making Black Box Models Explainable. https://christophm.github.io/interpretable-ml-book/

XGBoost documentation. http://xgboost.readthedocs.io/en/latest/

G. Montavon, W. Samek, and K.-R. Müller. Methods for interpreting and understanding deep neural networks. arXiv preprint arXiv:1706.07979, 2017.

Z. C. Lipton. The mythos of model interpretability. arXiv preprint arXiv:1606.03490, 2016.

F. Doshi-Velez and B. Kim. Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608, 2017.

A. Nguyen, J. Yosinski, and J. Clune. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 427–436, 2015.

Thank you